We are happy to present a special issue of the ACES Journal on Genetic Algorithms and hope readers enjoy the
interesting applications presented. The first paper uses a GA to design low sidelobe nonuniformly spaced arrays over
a wide bandwidth. The next paper shows the power of evolutionary algorithms applied to five design examples in
integrated optics, optical communication technology, and dielectric material modeling. Paper number
three explains how to apply a genetic algorithm to find the weightings in an array to generate a plane wave in the near
field of a planar array. The fourth paper shows how a hybrid GA/local optimizer can reduce the number of function calls
needed in the optimization of wire antennas via a method called clustering. A fifth paper proves that a parallel GA provides
an excellent solution to the problem of bandwidth reduction of sparse matrices encountered in computational
electromagnetics. Finally, a controversial paper is included that advocates the use of small population sizes and
relatively high mutation rates for optimization with GAs. We would like to thank the authors for their response to this
very fast publication deadline.


Introduction  to  Genetic  Algorithms  in  Electromagnetics 


Randy  L.  Haupt 
Utah  State  University 
Electrical  and  Computer  Engineering 
4120  Old  Main  Hill 
Logan,  UT  84322-4120 
Haupt@ieee.org 
435-797-2841 

This  special  issue  of  the  ACES  Journal  is  devoted  to  new  developments  in  Genetic  Algorithm  (GA)  applications  in 
computational  electromagnetics.  Genetic  Algorithms  have  become  extremely  popular  in  the  computational 
electromagnetics  literature.  The  papers  included  in  this  special  issue  are  very  arcane,  so  I  decided  to  include  an 
unreviewed  tutorial  overview  at  the  last  minute  as  an  introduction  for  those  of  you  who  are  at  a  more  basic  level. 
GAs  model  natural  selection  and  genetics  on  a  computer  to  optimize  a  wide  range  of  problems.  Some  of  the 
advantages  of  a  genetic  algorithm  include  that  it 

•  Optimizes  with  continuous  or  discrete  parameters, 

•  Doesn’t  require  derivative  information, 

•  Simultaneously  searches  from  a  wide  sampling  of  the  cost  surface, 

•  Deals  with  a  large  number  of  parameters, 

•  Is  well  suited  for  parallel  computers, 

•  Optimizes  parameters  with  extremely  complex  cost  surfaces, 

•  Provides  a  list  of  semi-optimum  parameters,  not  just  a  single  solution, 

•  May  encode  the  parameters  so  that  the  optimization  is  done  with  the  encoded  parameters,  and 

•  Works  with  numerically  generated  data,  experimental  data,  or  analytical  functions. 

These  advantages  have  inspired  many  people  working  in  computational  electromagnetics.  For  a  nice  historical 
development  of  applications  of  genetic  algorithms  in  electromagnetics,  see  [1]. 

A  genetic  algorithm  is  relatively  simple  compared  to  many  of  the  local  optimizers  used.  As  an  example,  consider  the 
very  simple  MATLAB  code  presented  in  [2]: 

% This is a simple binary GA

N = 8;      % number of bits in a chromosome
M = 16;     % number of chromosomes
last = 20;  % number of generations
M2 = M/2;   % number of chromosomes kept after ranking

% creates initial population
chromo = round(rand(M,N));

for ib = 1:last

    % ************************
    % insert subroutine to calculate
    % objective function output
    % cost = function(chromo)
    % cost is an M x 1 array (one value per chromosome)
    % ************************

    % ranks results and chromosomes (lowest cost first)
    [cost,ind] = sort(cost);
    chromo = chromo(ind(1:M2),:);

    % mate: one random crossover point per chromosome
    cr = ceil((N-1)*rand(M2,1));

    % pairs chromosomes
    % performs crossover
    for ic = 1:2:M2
        c = cr(ic);  % scalar crossover point for this pair
        chromo(M2+ic,1:c) = chromo(ic,1:c);
        chromo(M2+ic,c+1:N) = chromo(ic+1,c+1:N);
        chromo(M2+ic+1,1:c) = chromo(ic+1,1:c);
        chromo(M2+ic+1,c+1:N) = chromo(ic,c+1:N);
    end

    % mutate: flip one randomly chosen bit
    ix = ceil(M*rand);
    iy = ceil(N*rand);
    chromo(ix,iy) = 1 - chromo(ix,iy);

end % last

This  small  code  has  inspired  many  people  to  try  genetic  algorithms  and  is  given  to  students  taking  a  computational 
electromagnetics  course  at  Utah  State  University.  If  you  have  never  tried  a  GA,  then  this  one  is  a  good  starter 
program. 

Figure 1 shows the components of a GA. Compare this approach to a typical line search approach shown in Figure
2. The GA usually loses to a local optimizing line search in a race to the bottom of a bowl. On the other hand, the
GA has the ability to jump out of a bowl into another bowl within the search area whereas a line search is much
more constrained. Often a local optimizer is worth using after a GA finds the bowl containing the desired
minimum.


Figure  1.  Flow  chart  of  a  genetic  algorithm.  Numerical  simulation  of  genetics  and  evolution  occurs  in  the  gray 
box. 



Figure  2.  Flow  chart  of  a  typical  line  search  optimizer. 


A GA can work with either continuous parameters or binary encodings of the continuous parameters. In some cases,
the parameters are naturally binary. In either case, the GA begins by creating a random set of parameters called a
population. Each member of the population is a chromosome and contains all the information necessary as an input
to an objective function that creates an output of interest. This first part is a random search. Next, the algorithm
enters the gray box in Figure 1. Here, parents are selected to generate offspring by taking part(s) of one selected
parent chromosome and combining them with part(s) of one or more other parents. Natural selection occurs by weighting the
probability of a chromosome being selected as a parent in proportion to its fitness. Also, inferior solutions or
chromosomes with low fitness values are usually discarded from the population. Finally, random mutations are
introduced to the population by randomly changing parameter values or bits in the binary encoding.
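
The selection and mutation steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used by any paper in this issue: the function names `select_parents` and `mutate`, the toy fitness (the number of ones in a chromosome), and the mutation rate are all assumptions made for the example.

```python
import random


def select_parents(population, fitness, rng, k=2):
    """Pick k parents with probability proportional to fitness (roulette wheel)."""
    return rng.choices(population, weights=fitness, k=k)


def mutate(chromosome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in chromosome]


rng = random.Random(0)
population = [[0, 0, 0, 0], [1, 0, 1, 0], [1, 1, 1, 1]]
fitness = [sum(c) for c in population]  # toy fitness: number of ones

# The all-zero chromosome has zero fitness, so it is never chosen as a parent.
parents = select_parents(population, fitness, rng)
child = mutate(parents[0], rate=0.1, rng=rng)
```

Fitness-proportional weighting is only one way to bias selection; the ranking-and-truncation step in the MATLAB listing above is another.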

For the reader interested in pursuing introductory material on genetic and evolutionary programming, see the nice
articles by Fogel [3] and Holland [4]. Goldberg has been a leader in the field and his book [5] is an excellent
overview. For a practical introduction with a more tutorial, handholding approach to writing and using GAs, see [6].

[1]  D.S.  Weile  and  E.  Michielssen,  "Genetic  algorithm  optimization  applied  to  electromagnetics:  a  review,"  IEEE 
AP-S  Trans.,  Vol.  45,  No.  3,  Mar  97,  pp.  343-353. 

[2] R.L. Haupt, "An introduction to genetic algorithms for electromagnetics," IEEE Antennas and Propagation
Magazine, Vol. 37, No. 2, Apr 95.

[3]  D.B.  Fogel,  "Evolutionary  computing,"  IEEE  Spectrum,  Vol.  37,  No.  2,  Feb  00,  pp.  26-32. 

[4]  J.H.  Holland,  "Genetic  algorithms,"  Sci.  Am.,  Jul  92,  pp.  66-72. 

[5]  D.E.  Goldberg,  Genetic  Algorithms  in  Search,  Optimization,  and  Machine  Learning,  Reading,  MA:  Addison- 
Wesley,  1989. 

[6]  R.L.  Haupt  and  S.E.  Haupt,  Practical  Genetic  Algorithms,  New  York:  John  Wiley  &  Sons,  1998. 


THE  APPLIED  COMPUTATIONAL  ELECTROMAGNETICS  SOCIETY 

JOURNAL 

SPECIAL  ISSUE  ON  GENETIC  ALGORITHMS 


Vol.  15  No.  2  July  2000 

Editorial  -  Randy  L.  Haupt  and  J.  Michael  Johnson 

Introduction  to  Genetic  Algorithms  in  Electromagnetics  -  Randy  Haupt 


SPECIAL  ISSUE  PAPERS 

“A Genetic Algorithm Optimization Procedure for the Design of Uniformly Excited and
Nonuniformly Spaced Broadband Low Sidelobe Arrays”
B.J. Barbisch, D.H. Werner and P.L. Werner

“Application of Evolutionary Optimization Algorithms in Computational Optics”
D. Erni, D. Wiesmann, M. Spuhler, S. Hunziker, E. Moreno, B. Oswald,
J. Frohlich, and C. Hafner

“Genetic-Algorithm Optimization of an Array for Near-Field Plane Wave Generation”
N.N. Jackson and P.S. Excell

“Increasing Genetic Algorithm Efficiency for Wire Antenna Design Using Clustering”
D.S. Linden and R. MacMillan

“A Genetic Approach for the Efficient Numerical Analysis of Microwave Circuits”
L. Tarricone

“Optimum Population Size and Mutation Rate for a Simple Real Genetic Algorithm
that Optimizes Array Factors”
R.L. Haupt and S.E. Haupt


REGULAR PAPERS

“A Novel Preconditioning Technique and Comparison of Three Formulations for
Hybrid FEM/MoM Methods”
Y. Ji, H. Wang, and T.H. Hubing

“A New Excitation Model for Probe-Fed Printed Antennas on Finite Size Ground Planes”
F. Tiezzi, A. Alvarez-Melcon, and J.R. Mosig


©  2000,  The  Applied  Computational  Electromagnetics  Society 




ACES  JOURNAL,  VOL.  15,  NO.  2,  JULY  2000  SI:  GENETIC  ALGORITHMS 


A  Genetic  Algorithm  Optimization  Procedure  for  the  Design  of  Uniformly  Excited  and 
Nonuniformly  Spaced  Broadband  Low  Sidelobe  Arrays 

Brian  J.  Barbisch 
The  Pennsylvania  State  University 
Applied  Research  Laboratory 
P.O. Box 30
State College, PA 16804-0030

D. H. Werner
The Pennsylvania State University
Department of Electrical Engineering
211A Electrical Engineering East
University  Park,  PA  16802 
dhw@psu.edu 

P.  L.  Werner 

The  Pennsylvania  State  University 
Department  of  Electrical  Engineering 
University  Park,  PA  16802 
plw7@psu.edu 


ABSTRACT.  This  paper  presents  a  systematic  methodology 
for  designing  uniformly  excited  broadband  low  sidelobe 
linear  and  planar  antenna  arrays  by  varying  interelement 
spacings. In the past, attempts to develop a robust array
broadbanding  design  technique  have  been  only  marginally 
successful  because  of  the  large  number  of  possible  spacing 
combinations  involved,  coupled  with  the  theoretical 
limitations  surrounding  the  problem.  The  genetic  algorithm 
(GA)  has  recently  proven  to  be  a  very  effective  design  tool  for 
nonuniformly  spaced  low  sidelobe  antenna  arrays  with 
uniform  excitation  intended  for  operation  at  a  single 
frequency.  This  paper  introduces  an  approach  for  extending 
previous  applications  of  GA  to  include  the  design  of  optimal 
low  sidelobe  arrays  that  are  operable  over  a  band  of 
frequencies.  In  addition,  it  will  be  demonstrated  that 
designing  for  low  sidelobe  operation  over  a  bandwidth  adds 
significant  array  steerability  that  can  be  described  by  a 
simple  mathematical  relation.  Finally,  it  will  be  shown  that 
the  GA  objective  function  is  no  more  complicated  to  evaluate 
for  broadbanding  purposes  than  it  is  in  the  single  frequency 
case. Several examples of GA-designed broadband low
sidelobe  arrays  will  be  presented  and  discussed. 

1.  Introduction 

In  recent  years,  genetic  algorithms  have  found  a  fairly 
strong  presence  in  electromagnetics  optimization  problems 
involving  antenna  design.  The  difficulty  in  solving  many 
antenna  design  problems  is  that  very  often  there  are  many 
parameters  and  no  practical  analytical  methods  available  to 
optimally determine them. Such difficulties make robust
search strategies, like genetic algorithms, very important. The
main  advantages  of  using  the  GA  over  other  search  strategies 
are:  1)  the  GA  can  search  from  any  number  of  random  points 
to  find  a  solution,  2)  the  GA  works  with  a  coding  of  the 
parameters  and  not  the  actual  parameters,  3)  GA’s  use 
random,  not  deterministic,  transition  rules,  and  4)  the  GA 
does  not  require  the  evaluation  of  derivatives  [1].  Several 
books  have  been  written  which  discuss  genetic  algorithms  and 
demonstrate many useful applications [2-4]. Among the first
applications  of  genetic  algorithms  in  antenna  design  was  the 
thinning  of  large  arrays  [1].  Some  other  varieties  of  antenna 
arrays  to  which  the  GA  has  been  applied  include  planar  arrays 
[1,5],  multiple  beam  arrays  [6],  and  Yagi-Uda  arrays  [7]. 
There  have  also  been  several  excellent  review  articles  and 
books  written  about  GA’s  and  their  application  to  solving 
complex  engineering  electromagnetics  problems  [4],  [8-11]. 

The  capability  of  GA’s  to  produce  optimal  low  sidelobe 
designs  for  linear  arrays  of  uniformly  excited  isotropic 
sources  (at  a  single  frequency)  by  allowing  only  the 
interelement  spacings  to  vary  was  first  demonstrated  in  the 
pioneering work of [8]. Interelement spacings were decided
by using a 3 bit parameter such that they could vary in
increments of $\lambda/8$ with a minimum interelement spacing of
$\lambda/4$. In this paper, we will demonstrate that GA's are also an
extremely useful tool for broad-banding of uniformly excited,
unequally spaced antenna arrays. There are three major
advantages of the technique employed in this paper when
compared to previously published methods, such as those
described in [8]. These advantages are that 1) a much finer
discretization ($\approx \pm 0.01\lambda$) will be used, 2) the GA-designed
arrays will have minimal sidelobes over a band of frequencies
instead  of  at  just  one  frequency,  and  3)  these  arrays  will 
typically  have  a  much  wider  angular  region  over  which  the 
main  beam  can  be  steered  compared  to  those  optimized  for 
low  sidelobe  performance  at  a  single  frequency.  The  steady- 
state  genetic  algorithm  with  uniform  crossover  [12]  was 
chosen  for  use  in  optimizing  the  array  designs  discussed  in 
this  paper. 
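
As a rough illustration of the uniform crossover operator mentioned above, the following Python sketch builds two children by choosing, gene by gene, which parent each child inherits from. It is an assumption-laden sketch rather than the operator of [12]: the function name `uniform_crossover`, the 50% swap probability, and the list-of-floats chromosome are all hypothetical choices.

```python
import random


def uniform_crossover(parent_a, parent_b, rng, swap_probability=0.5):
    """Return two children; each gene is swapped between parents independently."""
    child_a, child_b = [], []
    for gene_a, gene_b in zip(parent_a, parent_b):
        if rng.random() < swap_probability:
            # Swap: child_a takes the gene from parent_b and vice versa.
            child_a.append(gene_b)
            child_b.append(gene_a)
        else:
            child_a.append(gene_a)
            child_b.append(gene_b)
    return child_a, child_b


rng = random.Random(42)
spacings_a = [0.25, 0.25, 0.26, 0.30]  # hypothetical spacing genes
spacings_b = [0.27, 0.31, 0.25, 0.28]
child_a, child_b = uniform_crossover(spacings_a, spacings_b, rng)
```

In a steady-state scheme, only a few members of the population would be replaced by such children in each generation, rather than the whole population at once.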

Although  many  traditional  analytical  techniques  exist  for 
placing elements in unequally spaced arrays for broad-banding
purposes, viz. [13-15], none of these methods are
capable  of  producing  significantly  low  sidelobe  levels  over 
the  entire  band.  The  focus  of  many  of  these  methods  is  to 
place  elements  in  an  array  such  that  the  minimum  separation 
between  elements  is  greater  than  or  even  much  greater  than  a 
wavelength.  The  advantage  of  such  large  interelement 
spacings  is  that  a  larger  bandwidth  can  be  achieved  because 
a  lower  minimum  frequency  is  possible.  The  disadvantage, 
however,  is  that  a  theoretical  lower  bound  exists  on  the 
sidelobe  level  when  average  interelement  separations  exceed 
a  wavelength  [16].  This  theoretical  minimum  is  usually  not 
low  enough  for  practical  applications.  Keeping  in  mind  this 
theoretical  limitation,  a  design  optimization  technique  will  be 
introduced  in  this  paper  which  attempts  to  place  elements 
such  that  the  average  interelement  spacing  in  the  array  is 
always  less  than  a  wavelength. 

Another  important  consideration  in  the  design  of  antenna 
arrays  is  their  steerability.  Broad-band  arrays  have  the 
property  that  they  may  exhibit  perfect  steerability  at  lower 
frequencies  of  operation,  but  steerability  is  reduced  when 
moving  to  higher  frequencies.  The  fact  that  steerability 
changes  with  frequency  can  be  quantified  by  the  bandwidth- 
steerability  product  of  the  array  [16]. 

A  useful  conversion  factor  will  be  introduced  in  Section  2 
that  permits  design  tradeoffs  to  be  made  between  bandwidth 
and  (minimum)  element  separation.  Steerability  issues  will 
also  be  briefly  discussed  in  Section  2.  Section  3  begins  by 
considering  an  example  of  an  optimized  low-sidelobe  array 
design  intended  for  operation  at  a  single  frequency. 
Following  this,  four  examples  of  genetic  algorithm  produced 
broadband  low  sidelobe  array  designs  are  presented  and 
discussed: two linear arrays (Section 3) and two planar arrays
(Section  4).  In  addition,  the  GA  objective  function  used  to 
produce  each  design  is  given  in  the  respective  sections.  All 
array  designs  considered  in  this  paper  were  specified  to  have 
a  maximum  possible  bandwidth  with  a  minimum  element 
separation of $\lambda_1/4$ and the lowest possible sidelobe level
throughout  the  band. 


2.  Some  Considerations  for  Broad-Banding 
Arrays 

In designing a broadband array for low sidelobe
performance, it is sufficient to design for the highest desired
frequency of operation $f_2$. Having done this, the frequency
may then be varied from $f_2$ to any $f_1$, provided $f_1 < f_2$,
without the appearance of any higher sidelobes. The
bandwidth for such an array is defined to be $B = f_2 / f_1$.
Furthermore, we note that if a minimum separation between
two elements exists at the lowest design frequency $f_1$ that is
considered too small for practical purposes, then that spacing
can be made larger at the expense of a smaller bandwidth
$B' < B$ (i.e., $f_2' < f_2$). This property is best illustrated by
the following useful transformation:

$$S' = \frac{B}{B'}\, S \qquad (1)$$

where

$S$ = the set of original element locations $\{s_n = d_n \lambda_1 : n = 1, 2, \ldots, N\}$
$S'$ = the set of new element locations
Hence,  the  array  configuration  need  only  be  optimized  for 
a  desired  maximum  bandwidth  B,  subject  to  some  specified 
tolerance  on  the  minimum  element  separation.  Once  this 
optimal  array  design  has  been  found  using  the  GA  then,  if 
desired,  the  transformation  given  in  (1)  may  be  employed  to 
find  modified  designs  which  tradeoff  larger  element 
separations  for  smaller  operating  bandwidths. 
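
The tradeoff can be made concrete with a short Python sketch. It assumes that the transformation amounts to scaling every element location by the ratio of the original bandwidth to the new one, which is consistent with the figures quoted in Section 3 for the 100 element array (a minimum separation of 0.25 wavelengths at a bandwidth of 3.97 becomes 0.49625 wavelengths at a bandwidth of 2). The function name `rescale_for_bandwidth` and the short list of positions are hypothetical.

```python
def rescale_for_bandwidth(positions, bandwidth, new_bandwidth):
    """Scale element locations (in wavelengths at f1) to trade bandwidth for spacing."""
    scale = bandwidth / new_bandwidth
    return [scale * d for d in positions]


# Hypothetical first few element locations of a symmetric array, in units of lambda_1.
positions = [0.125, 0.375, 0.625, 0.875]

wider_spacing = rescale_for_bandwidth(positions, bandwidth=3.97, new_bandwidth=2.0)
closer_spacing = rescale_for_bandwidth(positions, bandwidth=3.97, new_bandwidth=7.94)

# Separation between the two innermost elements of the symmetric array.
min_separation_wide = 2.0 * wider_spacing[0]
min_separation_close = 2.0 * closer_spacing[0]
```

Halving the bandwidth roughly doubles every spacing, and doubling it halves them, so only one GA run at the largest bandwidth of interest is needed.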

Another  notable  characteristic  of  broadband  arrays  is  how 
steerability  is  affected  with  increasing  bandwidth.  It  can  be 
shown  that  the  bandwidth  and  steerability  of  a  linear  array  are 
related  by  the  following  formula,  which  is  known  as  the 
bandwidth-steerability  product  [15]: 

$$B(1 + \cos\theta_0) = \frac{u_0}{d_{ave}\, d_{min}} \qquad (2)$$

where

$B$ = bandwidth
$\theta_0$ = steering angle
$u_0$ = the maximum value of $(d_{ave}/\lambda)\cos\theta$ that can be used before a sidelobe will exceed the desired sidelobe level
$d_{ave}$ = average interelement separation in the array
$d_{min}$ = smallest interelement separation in the array

The  right-hand  side  of  this  equation  is  a  constant,  and  is 
characteristic  of  the  individual  array.  Note  that  at  a  1:1 
bandwidth,  the  right-hand  side  of  the  equation  must  be  at  least 
two  to  guarantee  perfect  steerability.  When  a  bandwidth  of 
larger  than  1:1  is  desired,  the  left-hand  side  of  this  equation 
limits  steerability  at  some  of  the  higher  frequencies  in  the 
band.  Thus,  while  a  broadband  array  may  exhibit  perfect 
steerability  at  lower  frequencies  of  operation,  steerability  may 
become  limited  at  higher  frequencies  of  operation.  In 
addition,  arrays  designed  to  operate  at  only  one  frequency 
when  interelement  spacings  are  small  may  not  exhibit  any 
steerability. 
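
Because the right-hand side of the bandwidth-steerability product is a constant for a given array, it can be fixed from the broadside design, where the steering angle is 90° and the cosine term vanishes. The Python sketch below uses that observation to estimate the highest frequency ratio at which the beam can be steered to a given angle. The function name `max_frequency_ratio` is an assumption, as is the use of the absolute value of the cosine to treat steering on either side of broadside alike.

```python
import math


def max_frequency_ratio(broadside_bandwidth, steer_angle_deg):
    """Largest f/f1 for which the sidelobe level holds when steering to the given angle.

    The angle is measured from the array axis, so 90 degrees is broadside
    and 0 degrees is endfire ("perfect" steerability).
    """
    cos_term = abs(math.cos(math.radians(steer_angle_deg)))
    return broadside_bandwidth / (1.0 + cos_term)


# Broadside bandwidths quoted in Section 3 for the 40 and 100 element arrays.
ratio_40_endfire = max_frequency_ratio(3.5, 0.0)
ratio_100_at_60 = max_frequency_ratio(3.97, 60.0)
ratio_100_endfire = max_frequency_ratio(3.97, 0.0)
```

These values agree with the steerability limits reported for the two linear designs in Section 3, to the precision quoted there.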

3.  Linear  Broadband  Array  Designs 


The  array  factor  expression  for  the  far-field  radiation 
pattern  of  a  symmetric  linear  array  of  isotropic  sources  can  be 
written  in  the  following  form: 

$$AF(\theta) = 2\sum_{n=1}^{N} I_n \cos\left[2\pi (f/f_1)\, d_n \cos\theta + \alpha_n\right] \qquad (3)$$

where

$$\alpha_n = -2\pi (f/f_1)\, d_n \cos\theta_0 \qquad (4)$$

and

$2N$ = the total number of elements in the array
$I_n$ = excitation current amplitude of the nth element in the array
$\alpha_n$ = excitation current phase of the nth element in the array
$s_n = d_n \lambda_1$ = total distance of the nth element from the origin (note that the parameter $d_n$ is unitless)
$\theta$ = angle measured from the line passing through the antenna elements
$\theta_0$ = steering angle
$f_1$ = base (minimum) frequency of operation
$f$ = desired frequency of operation

The  objective  function  used  by  the  GA  in  this  paper  is  based 
on  the  array  factor  expression  given  in  (3),  where  the  desired 
goal  is  to  minimize  the  maximum  relative  sidelobe  level 
(RSLL)  of  the  array  over  some  prescribed  bandwidth.  In 
other  words,  each  gene  has  an  associated  RSLL  calculated 
from 


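$$F(\theta) = \max\left| \frac{2}{AF_{\max}(\theta)} \sum_{n=1}^{N} I_n \cos\left[2\pi B\, d_n \cos\theta\right] \right| \qquad (5)$$

where

$AF_{\max}(\theta)$ = peak of the main beam (for normalization)
$B = f_2 / f_1$ = desired bandwidth of the array ($f_2 \ge f_1$)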

The parameters $d_n$ were selected by the GA to minimize the
maximum sidelobe level with $I_n$ set to unity for all values of
n. The discretization of $d_n$ was made relatively fine, such
that it could be varied in increments of approximately ±0.01
between zero and some maximum selected value. The use of
any finer discretization was found to yield little improvement
in the overall results. It was also found that, in the case of
broadband array optimization ($B > 1$), the GA objective
function need not be any more complicated to evaluate than
it is for optimization of array performance at a single
frequency ($B = 1$). This is one of the attractive features of the
technique  presented  here,  since  it  means  that  the  overall 
design  optimization  time  required  by  the  GA  will  be 
essentially  the  same  regardless  of  whether  single-frequency  or 
broadband  array  configurations  are  being  considered. 
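
A minimal Python sketch of such an objective function is given below for a symmetric, uniformly excited, broadside array. It is not the authors' code: the function names, the sampling density, and in particular the rule used to exclude the main beam (walking down from the broadside peak to the first minimum) are assumptions. Evaluating the pattern at the bandwidth B over the full range of cos θ covers every lower frequency in the band, which is why a single evaluation suffices.

```python
import math


def array_factor(positions, bandwidth, u):
    """|AF| at u = cos(theta) for element distances `positions` (units of lambda_1)."""
    return abs(2.0 * sum(math.cos(2.0 * math.pi * bandwidth * d * u) for d in positions))


def relative_sidelobe_level_db(positions, bandwidth, samples=4001):
    """Peak sidelobe relative to the broadside main beam, in dB."""
    pattern = [array_factor(positions, bandwidth, k / (samples - 1)) for k in range(samples)]
    peak = pattern[0]  # broadside (u = 0) main-beam peak
    i = 0
    # Walk down the main beam until the pattern stops decreasing (first minimum).
    while i + 1 < samples and pattern[i + 1] < pattern[i]:
        i += 1
    return 20.0 * math.log10(max(pattern[i:]) / peak)


# Reference case: 40 elements, uniform half-wavelength spacing, single frequency.
uniform_positions = [0.25 + 0.5 * n for n in range(20)]
uniform_rsll = relative_sidelobe_level_db(uniform_positions, bandwidth=1.0)
```

A GA would call `relative_sidelobe_level_db` once per candidate set of element locations and rank the candidates by the returned value, lower being better.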

Previous  attempts  to  design  low  sidelobe  linear  antenna 
arrays  using  the  GA  have  been  limited  to  operation  at  a  single 
frequency (i.e., for $B = 1$) [1,5]. The GA approach introduced
in  this  paper  is  also  able  to  produce  low  sidelobe  designs  for 
B=1  as  a  special  case  of  a  more  general  procedure  which  is 
valid  for  B>1.  For  example,  given  a  uniformly  excited  40 
element  array  with  a  minimum  element  separation 
requirement of a quarter-wavelength (i.e.,
$\Delta_n = (d_{n+1} - d_n) \ge 0.25\ \forall\ n = 1, 2, \ldots, N-1$), the GA
was able to generate an array with maximum sidelobe levels
as low as -28.86 dB (see Figure 1a). Figure 1b shows the
array factor of the same array with the main beam steered
from 90° (broadside) to 91°. Notice that steering the beam
by even such a small amount as 1° in this case causes
sidelobes  to  rise  above  the  broadside  maximum  sidelobe  level 
of  -28.86  dB.  This  property  is  a  direct  consequence  of  the 
fact  that  the  array  is  not  designed  to  operate  over  a  significant 
bandwidth,  as  predicted  by  the  bandwidth-steerability  product 
(2).  It  will  be  demonstrated  in  this  paper,  however,  that 
significant  steerability  is  possible  for  broadband  arrays  where 
B>1. 

The  first  broadband  design  that  will  be  considered  is  also 
a  uniformly  excited  40  element  array.  The  minimum  element 
separation  requirement  will  be  a  quarter-wavelength  at  the 
lowest design frequency ($f = f_1$). In this case, the GA was
able to optimize interelement spacings so that a bandwidth of
$B = 3.5$ was possible for broadside operation with a maximum
sidelobe level of -19.41 dB throughout the entire band (see
Table 1). Figures 2a-2c show plots of the array factor at the
low-band ($f = f_1$), mid-band ($f = (f_1 + f_2)/2$), and
high-band ($f = f_2$) design frequencies. The radiation
patterns of an un-optimized, uniformly spaced 40 element
array at the same three frequencies are shown in Figures 3a-3c
for comparison purposes. Note that, as expected, the
maximum sidelobe level under these conditions is about -12.5
dB. In addition, Figures 4a-4c show the optimized 40
element array from Figures 2a-2c with the main beam steered
to 60°, also at the low-band, mid-band, and high-band design
frequencies. These figures demonstrate that, for the low-band
($f = f_1$) and mid-band ($f = 2.25 f_1$) frequencies, it is
possible to steer the main beam to 60° without any increase in
the synthesized sidelobe level. However, for the high-band
frequency ($f = 3.5 f_1$), we see that the synthesized sidelobe
level can no longer be maintained when the beam is steered
to 60°. Further investigation reveals, as predicted by (2), that
there is almost no steerability for this array when
$f = f_2 = 3.5 f_1$. The maximum frequency at which this
array exhibits perfect steerability was found to be
$f = 1.75 f_1$.

Figure 1. Array factor for a uniformly excited and
nonuniformly spaced 40 element linear array of isotropic
sources with (a) a broadside mainbeam and (b) the mainbeam
steered to 91° (one degree from broadside). The maximum
sidelobe level at broadside is -28.86 dB with a bandwidth of
$B = f_2/f_1 = 1$.

Table 1. Element separations at $f = f_1$ for the GA-optimized 40 element linear array (see Figures 2a-2c).

Element     Element          Element     Element          Element     Element          Element     Element
Number      Separation       Number      Separation       Number      Separation       Number      Separation
(n)         ($s_n/\lambda_1$)    (n)         ($s_n/\lambda_1$)    (n)         ($s_n/\lambda_1$)    (n)         ($s_n/\lambda_1$)

1           0.125            6           1.375            11          2.625            16          3.955
2           0.375            7           1.625            12          2.885            17          4.215
3           0.625            8           1.875            13          3.135            18          4.475
4           0.875            9           2.125            14          3.435            19          5.625
5           1.125            10          2.375            15          3.695            20          6.485
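
As a quick consistency check on Table 1, the Python sketch below recomputes the interelement gaps and confirms the quarter-wavelength minimum separation. It assumes the tabulated values are the distances $s_n/\lambda_1$ of each element from the array center, that the array is symmetric about the origin (so the two innermost elements are separated by twice the first entry), and that the entry for element 10 reads 2.375.

```python
# Element distances from the array center at f = f1, in units of lambda_1 (Table 1).
positions = [
    0.125, 0.375, 0.625, 0.875, 1.125,
    1.375, 1.625, 1.875, 2.125, 2.375,
    2.625, 2.885, 3.135, 3.435, 3.695,
    3.955, 4.215, 4.475, 5.625, 6.485,
]

# Gap across the origin, followed by the gaps between consecutive elements.
gaps = [2.0 * positions[0]] + [b - a for a, b in zip(positions, positions[1:])]

min_gap = min(gaps)
aperture = 2.0 * positions[-1]  # full length of the 40 element array
```

The inner elements sit at the minimum allowed spacing, while the spacing grows toward the ends of the array.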

Table 2. Element separations at $f = f_1$ for the GA-optimized 100 element linear array (see Figure 5).

Element     Element          Element     Element          Element     Element          Element     Element
Number      Separation       Number      Separation       Number      Separation       Number      Separation
(n)         ($s_n/\lambda_1$)    (n)         ($s_n/\lambda_1$)    (n)         ($s_n/\lambda_1$)    (n)         ($s_n/\lambda_1$)

1           0.125            14          3.375            27          6.865            40          11.355
2           0.375            15          3.625            28          7.115            41          11.615
3           0.625            16          3.875            29          7.365            42          12.105
4           0.875            17          4.125            30          7.615            43          12.355
5           1.125            18          4.375            31          7.865            44          13.115
6           1.375            19          4.625            32          8.115            45          13.365
7           1.625            20          4.875            33          8.365            46          14.095
8           1.875            21          5.365            34          9.095            47          14.345
9           2.125            22          5.615            35          9.345            48          14.595
10          2.375            23          5.865            36          9.595            49          15.085
11          2.625            24          6.115            37          9.845            50          15.585
12          2.875            25          6.365            38          10.105
13          3.125            26          6.615            39          10.865

Larger sized arrays were found to be capable of producing
wider bandwidths. For example, using $\lambda_1/4$ as the minimum
interelement separation for a 100 element array, it was
possible to optimize the array configuration using the GA to
yield a bandwidth of $B = 3.97$ and a maximum sidelobe level
of -20.32 dB (see Figure 5 and Table 2). The maximum
frequency at which this array is steerable to 60° is $f = 2.64 f_1$,
and the maximum frequency at which this array exhibits
perfect steerability is $f = 1.98 f_1$ (see Figure 6). If larger
interelement spacings are desired, then (1) may be used to
determine the corresponding reduction in bandwidth that
would result. For instance, increasing the $\lambda_1/4$ minimum
interelement separation to $0.49625\lambda_1$ reduces the bandwidth
from 3.97 down to 2. On the other hand, the bandwidth of the
array can be doubled to $B = 7.94$ by allowing the minimum
element separations to be as small as $\lambda_1/8$. Reducing the
minimum element separation to $\lambda_1/8$ also doubles the
bandwidth over which the array is steerable, i.e., in this case
the above array would be perfectly steerable over a bandwidth
of $B = 3.96$ with a maximum sidelobe level of -20.32 dB.

Figure 2. Plots of the array factor for an optimized broadside
($\theta_0 = 90^\circ$) uniformly excited and nonuniformly spaced 40
element linear array of isotropic sources at (a) $f/f_1 = 1$, (b)
$f/f_1 = 2.25$, and (c) $f/f_1 = 3.5$. The maximum bandwidth
for this array is $B = f_2/f_1 = 3.5$.

Figure 3. Plots of the array factor for a broadside
($\theta_0 = 90^\circ$) uniformly excited and uniformly spaced 40
element linear array of isotropic sources at (a) $f/f_1 = 1$, (b)
$f/f_1 = 2.25$, and (c) $f/f_1 = 3.5$.

4.  Planar  Broadband  Array  Designs 

The  GA  optimization  procedure  described  in  the  previous 
section  for  broadbanding  linear  arrays  will  be  generalized  in 
this  section  to  include  planar  array  configurations.  In 
particular,  the  GA  design  approach  will  be  developed  for 
rectangular  arrays  as  well  as  for  concentric  circular  arrays 
with  variable  element  spacings.  The  array  factor  for  a  non- 
uniformly  spaced  symmetric  rectangular  array  of  isotropic 
sources  may  be  represented  in  the  following  form: 

$$AF(\theta,\phi) = 4\sum_{n=1}^{N}\sum_{m=1}^{M} I_{mn} \cos\left[2\pi d_{xn}(f/f_1)\sin\theta\cos\phi\right] \cos\left[2\pi d_{ym}(f/f_1)\sin\theta\sin\phi\right] \qquad (6)$$

where

$2M$ = total number of elements in the y-direction
$2N$ = total number of elements in the x-direction
$s_{xn} = d_{xn}\lambda_1$ = element locations in the x-direction with respect to the origin
$s_{ym} = d_{ym}\lambda_1$ = element locations in the y-direction with respect to the origin

The corresponding RSLL in this case is calculated from

$$F(\theta,\phi) = \max\left|\frac{4}{AF_{\max}(\theta,\phi)}\sum_{n=1}^{N}\sum_{m=1}^{M} I_{mn}\cos\left[2\pi d_{xn}B\sin\theta\cos\phi\right]\cos\left[2\pi d_{ym}B\sin\theta\sin\phi\right]\right| \qquad (7)$$

The GA uses (7) to determine the set of parameters $d_{xn}$
and $d_{ym}$ that yields the lowest possible sidelobe level over a
specified bandwidth B, assuming that the array is uniformly
excited (i.e., $I_{mn} = 1$ for all values of m and n). In order to
accomplish this, a spacing scheme was designed such that
rows and columns were treated the same. For example, the
GA selects a set of spacings $S = \{d_1, d_2, d_3, \ldots, d_n\}$,
where n is the number of rows and columns in the array (i.e.,
N = M = n where N and M are from (6)). The number
$d_i$, $1 \le i \le n$, then represents the interelement spacing
between elements $(i,j)$ and $(i-1,j)$ and the elements
$(j,i)$ and $(j,i-1)$ for all $j$ with $1 \le j \le n$, where the indices
$(0,j)$ and $(i,0)$ are the y and x axes respectively. This
scheme makes the objective function very simple to evaluate
because the maximum sidelobe level will always be located
in the $\phi = 0^\circ$ and $\phi = 90^\circ$ planes. Figures 7a-7c show
radiation pattern cuts at $\phi = 0^\circ$, $\phi = 45^\circ$, and $\phi = 90^\circ$,
respectively, for a 4,096 element array which was designed to
produce a bandwidth of B = 3.5 and a maximum sidelobe
level of -19.41 dB with a minimum specified element
separation of $\lambda_1/4$.
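
The objective evaluation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: uniform excitation, the same spacing set reused for rows and columns, the quadrant-sum form of (6) with a leading factor of 4 (which cancels in the normalized sidelobe level), and a simple first-null search to exclude the main beam. The function names `planar_af`, `sidelobe_level_db` and `objective` are hypothetical and not taken from the paper.

```python
import math
from itertools import accumulate

def planar_af(positions, ratio, theta, phi):
    """Array factor of eq. (6) for I_mn = 1 and identical x/y element positions.

    positions  -- element locations d_1..d_n in units of lambda_1 (one axis, one side)
    ratio      -- frequency ratio f/f_1
    theta, phi -- observation angles in radians
    """
    u = 2.0 * math.pi * ratio * math.sin(theta) * math.cos(phi)
    v = 2.0 * math.pi * ratio * math.sin(theta) * math.sin(phi)
    # Uniform excitation lets the double sum factor into two single sums.
    sum_x = sum(math.cos(u * d) for d in positions)
    sum_y = sum(math.cos(v * d) for d in positions)
    return 4.0 * sum_x * sum_y

def sidelobe_level_db(spacings, ratio, phi, samples=2000):
    """Peak sidelobe level (dB) in one phi cut, relative to the broadside main beam."""
    positions = list(accumulate(spacings))  # interelement spacings -> locations
    peak = abs(planar_af(positions, ratio, 0.0, phi))
    cut = [abs(planar_af(positions, ratio, 0.5 * math.pi * k / samples, phi)) / peak
           for k in range(samples + 1)]
    k = 1
    while k < samples and cut[k] <= cut[k - 1]:  # walk down the main beam to its first null
        k += 1
    return 20.0 * math.log10(max(max(cut[k:]), 1e-12))

def objective(spacings, bandwidth):
    """Worst sidelobe level at f/f_1 = B over the phi = 0 and phi = 90 degree cuts."""
    return max(sidelobe_level_db(spacings, bandwidth, 0.0),
               sidelobe_level_db(spacings, bandwidth, 0.5 * math.pi))
```

For a uniformly spaced reference such as `objective([0.25, 0.5, 0.5, 0.5], 1.0)` the result is close to the familiar -13 dB level of a uniform array; a GA would instead search over the spacing set to push this value down at the upper band edge.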

Next  we  will  consider  an  alternative  design  optimization 
approach  based  on  concentric  circular  arrays  which  results  in 
a  more  spatially  uniform  distribution  of  sidelobes.  The  RSLL 
in  this  case  is  calculated  from 


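$$F(\theta,\phi) = \max\left|\frac{\sum_{m=1}^{M}\sum_{n=1}^{N_m} I_{mn}\exp\left[j2\pi B\,a_m\sin\theta\cos(\phi-\phi_{mn}) + j\alpha_{mn}\right]}{AF_{\max}(\theta,\phi)}\right| \tag{8}$$

where

$$AF(\theta,\phi) = \sum_{m=1}^{M}\sum_{n=1}^{N_m} I_{mn}\exp\left[j2\pi (f/f_1)\,a_m\sin\theta\cos(\phi-\phi_{mn}) + j\alpha_{mn}\right] \tag{9}$$

$$\alpha_{mn} = -2\pi (f/f_1)\,a_m\sin\theta_0\cos(\phi_0-\phi_{mn}) \tag{10}$$

and

$r_m = a_m\lambda_1$ = radius of the mth ring array
M = total number of concentric ring arrays
$N_m$ = total number of elements in the mth ring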
The spacing scheme was designed such that elements were
placed on arcs spaced $\lambda_1/4$ apart, where $\lambda_1$ corresponds to
the wavelength at the lowest design frequency $f_1$. In
addition to this, the elements in each quadrant were assumed
to be arranged symmetrically about their respective diagonal
axis (e.g., the elements in the first quadrant are symmetric
with respect to the $\phi = 45^\circ$ axis). Figure 8 shows one
quadrant of a concentric circular array in the equally spaced
case that could be constructed using this spacing scheme. For
this example, the minimum arc length between any two
consecutive array elements was set to $\lambda_1/4$. Figure 9 shows
the radiation pattern produced over the $\phi = 45^\circ$ cut by a 308
element concentric circular array with element spacings
optimized to yield a bandwidth of B = 3.5 with a maximum
sidelobe level of -21.91 dB throughout the band.
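
As a rough illustration of how the array factor of a concentric ring array can be evaluated, the following Python sketch sums the element phasors ring by ring. It assumes uniform excitation, ring radii given in units of $\lambda_1$, and, purely for simplicity, equally spaced elements on each ring (the optimized designs use nonuniform angular positions). The name `ring_af` and its parameters are hypothetical.

```python
import cmath
import math

def ring_af(radii, counts, ratio, theta, phi, theta0=0.0, phi0=0.0):
    """Array factor of a concentric ring array with unit excitations.

    radii  -- ring radii a_m in units of lambda_1
    counts -- number of elements N_m on each ring
    ratio  -- frequency ratio f/f_1
    theta0, phi0 -- steering direction used for the element phases alpha_mn
    """
    total = 0j
    for a_m, n_m in zip(radii, counts):
        for n in range(n_m):
            phi_mn = 2.0 * math.pi * n / n_m  # assumption: equal angular spacing
            alpha = -2.0 * math.pi * ratio * a_m * math.sin(theta0) * math.cos(phi0 - phi_mn)
            phase = 2.0 * math.pi * ratio * a_m * math.sin(theta) * math.cos(phi - phi_mn) + alpha
            total += cmath.exp(1j * phase)
    return total
```

With these phases all contributions add coherently in the steering direction, so `abs(ring_af(...))` evaluated at `theta = theta0`, `phi = phi0` equals the total number of elements, which is the natural normalization for a relative sidelobe level.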
Figure 4. Plots of the array factor for an optimized
uniformly excited and nonuniformly spaced 40 element linear
array of isotropic sources with $\theta_0 = 60^\circ$ at (a) $f/f_1 = 1$,
(b) $f/f_1 = 2.25$, and (c) $f/f_1 = 3.5$.
Figure 5. Array factor for an optimized uniformly excited
and nonuniformly spaced 100 element linear array of
isotropic sources with $f/f_1 = 3.97$.


Figure  6.  Array  factor  for  an  optimized  uniformly  excited 
and  nonuniformly  spaced  100  element  linear  array  of 

isotropic  sources  with  6  =  0O  and  f/fx  =  1.98 . 
Figure 7. Radiation pattern cuts at $\phi = 0^\circ$ (Figure 7a),
$\phi = 45^\circ$ (Figure 7b), and $\phi = 90^\circ$ (Figure 7c) of an
optimized 4,096 element square planar array with
$f/f_1 = 3.5$.
5. Conclusions

Uniformly excited array broad-banding has been achieved
using a genetic algorithm optimization procedure with
bandwidths as large as B = 3.97 for linear arrays and B = 3.5
for planar arrays with a minimum element separation of
$\lambda_1/4$. Minimum element separation can easily be made
larger to avoid mutual coupling effects, or it can be made
smaller  to  increase  bandwidth  by  using  the  convenient 
conversion  factor  given  in  (1).  Array  steerability  issues  have 
also  been  addressed  in  this  paper.  Steerability  varies  with 
operation  frequency  as  predicted  by  (2)  -  it  is  greater  at  lower 
frequencies  of  operation  and  lesser  at  higher  frequencies  of 
operation.  In  addition,  the  bandwidth  over  which  the  array  is 
steerable  improves  proportionally  as  (1)  is  used  to  increase 
bandwidth.  It  should  also  be  noted  that  in  order  to  include 
steerability  within  the  optimization  scenario  (in  the  sense  of 
a  multi-objective  constraints  synthesis  procedure),  we  could 
adopt  a  more  general  definition  of  the  objective  function  that 
includes  the  right-hand  side  of  (2).  Finally,  we  point  out  that 
even  lower  sidelobe  levels  might  be  achieved  in  some  cases 
by  including  the  element  pattern  in  the  optimization  scheme. 




Figure  8.  One  quadrant  of  an  equally  spaced  concentric 
circular  ring  array  that  is  arranged  symmetrically  about  its 
diagonal  axis. 


Figure 9. Radiation pattern cut at $\phi = 45^\circ$ for an optimized
broadband concentric circular ring array with $f/f_1 = 3.5$.
ABSTRACT - The spatial and spectral treatment of
electromagnetic fields is an essential operation
regarding, e.g., the functionality of dense integrated
optical devices. Such molding of fields can hardly be
handled without sophisticated heuristic optimization
tools. By means of five design examples we
demonstrate that evolutionary algorithms (EA) are
well suited to solving "real world" inverse problems
in various applications in the field of planar
integrated optics, optical communication technology,
and dielectric material modeling. In comparison
to other optimization schemes, EAs are even
able to deliver structural and temporal information about
the device under optimization, which is an important
feature when targeting computer guided engineering
and virtual design platforms.

1.  INTRODUCTION 

Evolutionary  algorithms  (EA)  [1]  are  computer 
codes  which  emulate  the  search  process  of  natural 
evolution.  This  class  of  optimization  algorithms  rests 
upon  the  collective  learning  process  within  a 
population  of  individuals,  each  of  which  represents  a 
search  point  in  the  space  of  potential  solutions  to  the 
given problem. Because of an implicit parallelism in
the search behavior they avoid the common pitfalls of
local optimization algorithms and hold the promise of
finding novel solutions perhaps not thought to exist.

The  latter  aspect  -  i.e.,  the  structural  optimization 
feature  -  has  successfully  been  applied  to  several 
different  types  of  design  problems  in  planar  integrated 
optics [2], such as single longitudinal mode multi-cavity
laser diodes [3], [5]-[10], ultra-short non-periodic
segmented spot-size converters for highly
efficient chip-to-fiber coupling [9]-[13] and
concatenated  Bragg  gratings  for  apodized  add/drop 
filters  in  wavelength  division  multiplexing  (WDM) 
network  nodes  [14].  In  earlier  contributions  [15],  [16], 
evolutionary  algorithms  have  also  been  considered  as 
very  efficient  regarding  their  parameter  estimation 
features  in  the  context  of  speeding  up  costly 
computational  electromagnetics  simulations.  They  have 
also  been  applied  when  optimizing  frequency  channel 
distributions  in  fiber  optic  SCM-links  [17]  and  for  the 
determination  of  analytical  dispersion  models  for 
complex  and  highly  lossy  dielectric  materials  [18]. 

In the paper presented here, we will outline all
design examples mentioned above. The
remainder of the paper is organized as follows: In
Section 2, we briefly explain our special type of
evolutionary algorithm, which is then used for the
optimization of an active waveguide device, namely a
non-periodic coupled-cavity semiconductor laser
diode. Section 3 is dedicated to the design of realistic
apodized  concatenated  Bragg  gratings  as  highly 
selective  add/drop  filters  for  wavelength  division 
multiplexing  (WDM)  applications.  The  spatial 
treatment  of  guided  modes  by  a  non-periodically 
segmented  waveguide  structure  leading  to  a  very 
compact  and  efficient  spot-size  converter  is  reported 
in  Section  4.  Section  5  describes  the  optimization  of 
frequency  channel  distributions  in  fiber  optic  SCM- 
links  and  the  determination  of  an  analytical  dielectric 
material  model  is  given  in  Section  6. 

After  these  elucidations,  a  brief  outlook  is  given, 
focusing  on  some  algorithmic  prospects  (Section  7) 
and  tracing  two  aspects  towards  computer  guided 
engineering  (Section  8)  as  well.  We  conclude  our 
contribution  with  a  short  summary  in  Section  9. 
2. MULTI-CAVITY LASER TOPOLOGIES

An  economically  priced  monolithic  GaAs/AlGaAs 
laser  diode  with  an  emission  wavelength  around 
852  nm  represents  an  attractive  light  source  for  low- 
cost  high-precision  time  and  distance  metrology.  Such 
single-longitudinal-mode laser operation usually relies
on distributed Bragg reflector (DBR) laser topologies
or distributed feedback (DFB) lasers.
Both utilize a fine-scale grating, mostly with periods
on the order of a few hundred nanometers. This puts
high demands even on state-of-the-art lithographic
reproduction, resulting in very high costs.

In  order  to  focus  on  simple  laser  processing,  we 
restrict our design to large-scale non-periodic perturbations
in the form of multi-section cavity
structures.  Such  irregular  topologies  are  now  to  be 
optimized  with  respect  to  given  laser  specifications. 

Fig.1: Representation of the non-periodic multi-cavity
laser structure (phenotype) by a 5-valued integer
string (genotype) including contact electrodes for
current injection.

The type of breeder genetic algorithm (see also
[4]) presented here works on fixed-length bit-strings.
It starts by initializing a population of N = 50 bit-strings
randomly. Then the population evolves by
using probabilistic genetic operators for reproduction
purposes. Within this frame, two parent strings are
selected by the fitness-proportional roulette-wheel
selection process. Two offspring are then generated
using two-point crossover and mutation. For
the forward problem a laser simulator is activated,
delivering all characteristic data needed for the quality
rating of each offspring. After judging the quality
(fitness) of these new individuals, two advantageous
aspects of our implementation should be mentioned [5],
[6]: 1.) every new individual is checked as to whether it is
already included in the population. Allowing no
duplicates guarantees a certain diversity and avoids
premature convergence. 2.) only individuals better than
the worst one in the population are inserted, i.e., a
strict breeding is done. The whole reproduction
process defines a loop which is carried out until the
number of calculated individuals reaches a certain
predefined value.
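
The reproduction loop described above can be sketched as follows. This is a minimal illustration, not the authors' code: the mutation rate, the evaluation budget, the safety cap on attempts and all function names are assumptions, and the laser simulator is replaced by an arbitrary non-negative `fitness` callable. The search space must contain at least `pop_size` distinct strings.

```python
import random

def roulette(pop, rng):
    """Fitness-proportional selection; pop holds (fitness, genome) with fitness >= 0."""
    total = sum(f for f, _ in pop)
    pick = rng.uniform(0.0, total)
    running = 0.0
    for f, genome in pop:
        running += f
        if running >= pick:
            return genome
    return pop[-1][1]

def two_point_crossover(a, b, rng):
    """Exchange the segment between two random cut points."""
    i, j = sorted(rng.sample(range(len(a) + 1), 2))
    return a[:i] + b[i:j] + a[j:], b[:i] + a[i:j] + b[j:]

def mutate(genome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return tuple(bit ^ 1 if rng.random() < rate else bit for bit in genome)

def breeder_ga(fitness, length, pop_size=50, budget=3000, rate=0.02, seed=1):
    rng = random.Random(seed)
    seen = set()  # genomes currently in the population
    while len(seen) < pop_size:
        seen.add(tuple(rng.randint(0, 1) for _ in range(length)))
    pop = [(fitness(g), g) for g in sorted(seen)]
    evaluated = len(pop)
    tries = 0
    while evaluated < budget and tries < 20 * budget:  # cap guards against stagnation
        tries += 1
        parent_a, parent_b = roulette(pop, rng), roulette(pop, rng)
        for child in two_point_crossover(parent_a, parent_b, rng):
            child = mutate(child, rate, rng)
            if child in seen:
                continue  # 1.) duplicates are not allowed
            score = fitness(child)
            evaluated += 1
            worst = min(range(pop_size), key=lambda k: pop[k][0])
            if score > pop[worst][0]:  # 2.) strict breeding: must beat the worst
                seen.discard(pop[worst][1])
                pop[worst] = (score, child)
                seen.add(child)
    return max(pop)
```

For example, `breeder_ga(sum, 30)` maximizes the number of ones in a 30-bit string; in the laser problem the call to `fitness` would instead run the forward solver.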


Fig.2: Best performing laser solution. a) The effective refractive index distribution along the cavity shows 59
sections at a total length of 730 µm. The position of current injection is sketched by its corresponding electrode
(labeled as a bold line). b) Corresponding round-trip gain spectrum Gn. Lasing occurs at the circle, all round-trip
phase zeros are marked with dots and the small cross indicates the material gain maximum. The distinct
mode selectivity should be considered in the context of the very low effective refractive index contrast of the
perturbed laser cavity.
In order to judge the quality of each search point a
fitness value has to be defined, relying on the forward
solver’s specific output. As the main validation
criterion within all further simulations the round-trip gain
Gn is taken in terms of a potential mode-selectivity at
lasing threshold. The round-trip gain Gn represents the
oscillation condition itself. According to our laser
structure, the overall fitness is defined as a sum of
three different fitness numbers: one concerns the side-mode
suppression within the round-trip gain
spectrum. A second term validates the coincidence
between  the  position  of  the  material  gain  peak  and  the 
specified  wavelength  of  852  nm.  The  third  term 
measures  the  wavelength-difference  between  the  lasing 
point  and  this  specification. 

Following [8], a representation scheme (Fig.1) of
the multi-cavity laser structures is obtained using a
fine-scale discretization. Assuming a maximal laser
length of L = 1000 µm and a discretization resolution
of $\delta L$ = 5 µm, the laser topology can be described as an
array with $L/\delta L$ = 200 integers each representing one
segment within the potential laser cavity. Each segment
having an effective refractive index Na or Nt is assigned
to an integer value of 2 or 1 respectively. A “don't
care” represented by an integer value of 0 does not
influence the decoding operation when mapping the
integer array (genotype) into its corresponding physical
representation (phenotype).

In  combination  with  genetic  operators  such  as 
crossover  and  mutation  the  optimization  procedure  has 
the  ability  to  build  up  lasers  with  different  lengths. 
Further  we  allow  the  optimizer  to  “decide”  how  the 
current  injection  into  the  laser  structure  has  to  be 
performed  when  searching  for  appropriate  numbers  and 
positions of contact electrodes. A contacted segment
may simply be marked by a reversed sign of its
corresponding integer (allele), leading to a 5-valued
genotype and therefore to a tremendously large search
space of $5^{200} \approx 10^{140}$ search points.
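
To make the genotype-to-phenotype mapping concrete, the following Python sketch decodes a 5-valued integer string into a list of cavity sections and checks the size of the search space. It is only an illustration: merging neighbouring identical segments into one section, the tuple layout and the names `decode` and `SEGMENT_UM` are assumptions made here.

```python
import math

SEGMENT_UM = 5.0  # discretization resolution in micrometers

def decode(genotype):
    """Map alleles in {-2, -1, 0, 1, 2} to (index_label, contacted, length_um) sections.

    |allele| = 2 -> index Na, |allele| = 1 -> index Nt, 0 -> "don't care" (skipped),
    negative sign -> segment carries a contact electrode.
    """
    sections = []
    for allele in genotype:
        if allele == 0:
            continue  # "don't care" does not contribute a segment
        label = "Na" if abs(allele) == 2 else "Nt"
        contacted = allele < 0
        if sections and sections[-1][0] == label and sections[-1][1] == contacted:
            # extend the previous section instead of starting a new one
            sections[-1] = (label, contacted, sections[-1][2] + SEGMENT_UM)
        else:
            sections.append((label, contacted, SEGMENT_UM))
    return sections

# Five possible values on each of the 200 positions:
search_space_exponent = math.log10(5 ** 200)  # about 139.8, i.e. roughly 10^140 points
```

Because "don't care" alleles are skipped, strings of equal length decode to cavities of different physical length, which is what allows crossover and mutation to change the laser length.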

The  performance  of  the  multi-cavity  laser  structure  is 
evaluated  by  applying  the  well  known  transfer-matrix 
analysis  [19].  All  material  properties  involved  such  as 
material  gain  and  the  carrier  induced  refractive  index 
change  are  obtained  from  optical  gain  measurements  and 
are  implemented  as  an  appropriate  spectral  model  [5]. 
The effective refractive index difference representing
the perturbation is assumed to be $1.92 \cdot 10^{-2}$.

Our optimization scenario [8] after 33720
evaluated individuals yields a maximal performing
structure (Fig.2a) with a fitness of $1.056875 \cdot 10^{6}$. The
spread of fitness values within the optimized population
is around 4%. It should be noted that good solutions
(fitness > $4 \cdot 10^{5}$) are already achieved after less than 700
iterations. The round-trip gain spectrum Gn of the best
performing laser structure (Fig.2b) shows the desired
distinct wavelength selectivity permitting single longitudinal
mode lasing operation at 852.10 nm. Here the
current injection reaches a threshold value of 11.98 mA
when lasing.

3.  CONCATENATED  GRATING  FILTERS 

Wavelength division multiplexing (WDM) at
wavelengths of 1520-1570 nm in optical fiber networks
for, e.g., 2.488 Gb/s data rates demands (integrated) optical
filters for adding and dropping single wavelength channels
at certain network nodes. Bragg grating based filters
become very attractive when the requirements for intra-channel
crosstalk are stringent. Unfortunately, uniform
Bragg gratings suffer from poor sidelobe suppression in
their spectral response. If only a certain inter-channel
crosstalk, i.e. a certain sidelobe level at the neighboring
channel, is allowed, the high sidelobes result in a large
channel spacing and thus in a small bandwidth utilization.
In order to circumvent this deterioration, apodized grating
structures - i.e., gratings with longitudinally varying mode
coupling constants according to a bell-like weighting
function - are strongly recommended.

An obvious way to alter the coupling strength of
surface corrugated gratings consists of a corresponding
change in etch depth of the periodic ridge waveguide
corrugation (another attempt using a direct UV-writing
technology [20], [21] to locally change the
planar glass waveguide’s effective refractive index is
still under investigation). However, to preserve
process reproducibility a binary grating, i.e., a constant
etch depth, is preferred. One apodization method obeying
this constraint exploits the dependence of the
coupling coefficient on the grating duty cycle [22]. In
this approach the minimum coupling coefficient is
determined by the most extreme duty cycle that is
producible, i.e., the one deviating most from 50%, which
has to be found experimentally. We found a minimum
duty cycle of about 10% to be a typically achievable
value for glass waveguides with grating periods of about
500 nm [14]. In consequence, any apodization function
realized within our production technology will be
truncated. Classical windowing functions of, e.g., a
Hamming (or a raised cosine) shape suppress all
sidelobes below a certain level (e.g. -50 dB) that is given
by the function itself and the accuracy of its practical
realization. Thus, all classical windowing schemes tend
to perform unsatisfactorily when truncated (for the
Hamming window the sidelobe level rises up to -14 dB
when this apodization function has to comply with a
minimal available duty cycle of 10%). We have therefore
decided to look for apodization functions that are
optimized, taking experimental constraints into account,
with the more pragmatic goal to just suppress all
sidelobes outside a certain bandwidth.

Fig.3: Fitness evolution of different grating filter
optimization attempts as a function of evaluated
individuals. A discrete valued Hamming distribution
of the coupling constant acts here as a starting guess
for the initial SDH.

Fig.4: Simulated spectral response of a concatenated
grating (solid line) and of the equivalent uniform
grating (dotted line). Both gratings are 11 mm long.

Fig.5: Coupling strength distribution along the
grating for the optimized concatenated grating (solid
line) and several conventional taper functions
(Blackman function (dotted line), raised sine (dash-double
dotted line), sine (dashed line), positive
hyperbolic-tangent profile (dash-dotted line)).

The  choice  of  the  optimization  scheme  was  also 
influenced  by  the  discrete  nature  of  the  actual  problem 
representation:  the  gratings  are  usually  implemented  by  a 
vector  scan  electron  beam  lithography  system  with  a 
discrete  address  grid.  The  set  of  producible  duty  cycles 
and  hence  the  set  of  realistic  coupling  coefficients  is  thus 
given  once  the  writing  field  size  has  been  chosen.  The 
only  parameters  that  are  available  when  optimizing  the 
coupling  strength  profile  are  the  lengths  of  the  different 
grating  regions.  Furthermore,  each  length  should  be  an 
integer  multiple  of  its  corresponding  grating  period. 
Therefore,  finding  an  appropriate  apodization  scheme  - 
i.e.,  to  trace  an  appropriate  concatenation  of  different 
subgratings  -  always  represents  a  crucial  combinatorial 
optimization  problem  which  is  efficiently  solved  only  by 
a  genetic  algorithm  [5],  [6],  [8]. 

To  evaluate  the  gratings  we  first  have  to  define  the 
desired  crosstalk  levels,  e.g.,  an  intra-channel  crosstalk 
better  than  -30  dB  within  a  bandwidth  of  0.4  nm  and  an 
inter-channel  crosstalk  of  -25  dB  outside  a  bandwidth  of 
0.8 nm. According to [23] the inter-channel crosstalk
requirement for neighboring channels is less strict and
amounts to -20 dB. We use the larger value to give the
optimizer a larger margin. In each iteration step the
grating response is calculated using the well known
transfer-matrix method [24]. According to the given filter
specification, the overall fitness is consequently defined
as a sum of two different fitness constituents: One
number validates the actual spectral filter response with
respect to the desired inter-channel crosstalk and a
second term measures the spectral deviation with regard
to the given intra-channel crosstalk specifications. Fig.3
shows  the  fitness  evolution  for  a  grating  consisting  of  40 
grating  sections  with  corresponding  duty  cycles.  In  order 
to  compare  our  breeder  genetic  algorithm  (solid  line)  with 
alternative  optimization  schemes  we  have  also  plotted  the 
evolution  when  enabling  a  specific  simplex  downhill 
(SDH)  optimization  working  on  discrete  number  spaces 
(dotted  line).  As  starting  guess  for  the  coupling  strength 
distribution  we  used  a  discrete  valued  Hamming  function. 
Referring  to  the  corresponding  trace  in  Fig.3  it  is  clearly 
visible  that  the  simplex  downhill  method  gets  caught  in  a 
local optimum. Additionally, we have stopped our genetic
algorithm after a certain number of evaluated individuals
and followed it with a simplex downhill optimization
(several dashed lines). The simplex downhill usually tends
to accelerate the tracking down of promising parameter
sets near a local optimum of the fitness landscape. But it is
noteworthy that a prior global optimization
procedure is always mandatory.
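
A two-term fitness of the kind described above could, for instance, be written as a sum of penalties for spectral samples that violate the crosstalk masks. The sketch below is an assumption-laden illustration rather than the authors' fitness: it takes the intra-channel crosstalk to be the residual transmission inside the channel and the inter-channel crosstalk to be the reflection outside the guard band, and the function name, the sample layout and the linear-in-dB penalty are all hypothetical.

```python
def filter_fitness(samples, intra_bw=0.4, inter_bw=0.8,
                   intra_spec_db=-30.0, inter_spec_db=-25.0):
    """Return 0 for a response that meets both masks, negative values otherwise.

    samples -- iterable of (detuning_nm, reflection_db, transmission_db),
               detuning measured from the Bragg wavelength
    """
    intra_penalty = 0.0  # transmission leaking through inside the channel
    inter_penalty = 0.0  # reflection sidelobes reaching neighboring channels
    for detuning, reflection_db, transmission_db in samples:
        if abs(detuning) <= intra_bw / 2:
            intra_penalty += max(0.0, transmission_db - intra_spec_db)
        elif abs(detuning) >= inter_bw / 2:
            inter_penalty += max(0.0, reflection_db - inter_spec_db)
    return -(inter_penalty + intra_penalty)
```

In an optimization loop the `samples` would come from the transfer-matrix response of each candidate grating, so that larger fitness values correspond to smaller mask violations.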

After 2000 iterations (and an additional 1300 downhill
simplex iterations) a representative design has led to 50
grating solutions where the best performing one has a
potential bandwidth-utilization factor of 50% at an intra-channel
crosstalk of -30 dB and an inter-channel crosstalk
of -21 dB close to the Bragg resonance, which
complies well with the requirements (Fig.4).

As shown in Fig. 5 the 3 µm wide ridge waveguide
Bragg gratings consist of 40 different subgrating sections
having an overall length of 11 mm. All of them are
producible in an inexpensive planar SiO2/SiON glass
technology with an available etch depth of 100 nm.
Compared to, e.g., commonly used
thin-film interference filter synthesis methods [25], our
evolutionary optimization procedure potentially requires
an objectionable computational effort. But from the
viewpoint of a realistic design, this sobering prospect
should be reassessed as a promising one, especially
with regard to our design procedure’s ability to
include all critical nonidealities of the technological
production process.


4. ULTRA SHORT SPOT-SIZE CONVERTER

In the last two sections we described how our
evolutionary algorithm can be used to comply with the
spectral specifications within a design procedure of
integrated optical devices. The example now under
consideration is dedicated to the spatial treatment of
optical fields regarding the functionality of such devices.
Because of its large refractive index difference ($\delta n \approx 0.02$)
the planar SiO2/SiON glass waveguide technology has the
benefit of allowing small bending radii on the order of
1 mm. Therefore, this inexpensive technology meets the
requirements for dense integrated optics. But such strong
waveguiding inevitably has its drawback in the
mode mismatch at an optical transition between chip and
single mode fiber. Direct butt-coupling would cause losses
of more than 3.5 dB. In order to reduce these losses, the
modal shape of the integrated waveguide’s fundamental
mode has to be converted into a shape as close as possible
to the fundamental fiber mode.

Fig.6: Example of a planar spot-size converter. For
visualization purposes the upper cladding is not
shown. Only changes in the width and segmentation
are supported. Such structures can be manufactured
as simply as a normal waveguide.

Fig.7: The fitness evolution through the converter is
shown here. The real structure will be cut at the
position where the highest fitness is obtained.
Therefore the implemented converter is usually
considerably shorter than the total structure. The
fitness is calculated after each BPM propagation
step. The best fitness ever encountered (here at about
110 µm, shown by the vertical line) is retained as the
overall fitness of the converter.
Several approaches for transforming the modal
properties are already known [26]. Because of the
difficulty of producing vertical tapering, a structure
must be found that does not require such
additional fabrication steps.

A  converter  structure,  which  is  easy  to  fabricate 
within  a  rigorous  planar  waveguide  concept,  consists 
of  a  segmented  waveguide  with  or  without  lateral 
tapering. In general, such spot-size converters
do not have to be periodically segmented (Fig.6).
Our approach [9]-[13] leaves an evolutionary
optimizer to “decide” for itself how much tapering and
segmentation is needed to obtain an optimal mode
conversion.

The actual problem to be optimized is hence
a chip-to-fiber coupler at an operational
wavelength of 1550 nm where the width of the ridge
waveguide is 3 µm with a residual layer thickness of
about 1 µm, and the single mode fiber has a core
diameter of 9 µm. The coupling loss Lc (including
scattering losses within the spot-size converter
structure) is defined as


where  is  the  fundamental  mode  of  the  ridge 
waveguide, $\Psi_a$ is the optical field after the spot-size
converter, $\Psi_f$ is the fundamental mode of the fiber
and the integration is performed along the waveguide’s
cross-sectional plane A. The optimization goal
is to find a suitable structure that minimizes the
coupling loss Lc. The fitness of the structure is
therefore defined as F = 1 - Lc and has to
be maximized.

Fig.8: $|E_y|$-field distribution (TM-polarization) within
a converter structure. Left of the dashed line the
width of the original waveguide is shown. A horizontal
slice of the ridge waveguide is superposed. The
expansion of the propagating field is clearly visible.
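
One common way to evaluate a coupling loss of this kind is a normalized overlap integral between the field leaving the converter and the fiber mode, with the power of the launched waveguide mode as reference so that scattering losses are included. The Python sketch below assumes that form, which is not necessarily the authors' exact definition, and works on sampled one-dimensional fields for brevity. The names `coupling_loss`, `psi_0`, `gaussian` and `loss_db` are hypothetical.

```python
import math

def coupling_loss(psi_a, psi_f, psi_0, dx):
    """Lc = 1 - |<psi_a, psi_f>|^2 / (<psi_0, psi_0> <psi_f, psi_f>), linear scale.

    psi_a -- sampled field after the converter
    psi_f -- sampled fundamental fiber mode
    psi_0 -- sampled fundamental mode launched into the converter
    dx    -- sample spacing of the integration grid
    """
    overlap = sum(a * f.conjugate() for a, f in zip(psi_a, psi_f)) * dx
    power_in = sum(abs(p) ** 2 for p in psi_0) * dx
    power_fiber = sum(abs(f) ** 2 for f in psi_f) * dx
    return 1.0 - abs(overlap) ** 2 / (power_in * power_fiber)

def gaussian(width, xs):
    """Gaussian test field exp(-(x/width)^2)."""
    return [math.exp(-(x / width) ** 2) for x in xs]

def loss_db(loss):
    """Convert a linear coupling loss into decibels."""
    return -10.0 * math.log10(1.0 - loss)
```

With this definition the fitness F = 1 - Lc is simply the fraction of the launched power that ends up in the fiber mode. For two one-dimensional Gaussian fields of widths 1.5 and 4.5 (direct butt-coupling without a converter, so `psi_a` equals `psi_0`), the loss evaluates to about 0.4, i.e. roughly 2.2 dB.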

Similar to the laser problem a genotype is defined as
follows: The converter is divided into N sections of 2.7
µm length (a choice which is motivated mainly by
technological reasons). Each section’s width is
represented by a multi-valued bit, where each bit can
hold 42 different values. Values from 0 to 40 correspond
to the real width of the waveguide in steps of 0.5 µm and
-1 stands for “don't care”. The “don't care” bits are
needed to leave the total converter length variable. Each
converter is then calculated using an FD-BPM (finite
difference beam propagation method) based code. The
fundamental  modes  of  both  the  waveguide  and  the  fiber 
are  calculated  with  the  imaginary  distance  BPM  [27]. 

The  evolutionary  algorithm  is  initialized  with  a 
starting  population  of  100  individuals  each  having  a 
maximum  length  of  70  sections.  The  fitness  value  F 
is  evaluated  after  each  propagation  step.  The  best 
fitness  ever  encountered  along  the  structure  is  taken 
as  the  nominal  fitness  of  the  corresponding  converter. 
An  example  of  the  fitness  distribution  within  an 
optimized structure is shown in Fig. 7. For these BPM
simulations, the propagation step size is chosen as
0.25 µm for reasons of stability.

The best performing of our evolutionary optimized
converter topologies was achieved after
evaluating only 10350 out of a total of $4.24 \cdot 10^{113}$
possible solutions. It consists of 15 different ridge
waveguide segments and reduces the coupling loss
from 3.5 dB down to about 1.3 dB. A 0 dB coupling
loss is hardly possible because the residual layer in
the waveguide structure severely handicaps the
vertical expansion of the optical field.
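
The quoted number of possible solutions follows directly from the genotype: 42 admissible values on each of at most 70 sections. The short Python sketch below decodes such a genotype and reproduces the count. The constant and function names are hypothetical, and treating a width of 0 as a gap between segments is an assumption.

```python
import math

SECTION_LENGTH_UM = 2.7  # length of one converter section
WIDTH_STEP_UM = 0.5      # width resolution
DONT_CARE = -1           # allele that contributes no section

def decode_converter(genes):
    """Return (list of section widths in micrometers, total length in micrometers)."""
    widths = [g * WIDTH_STEP_UM for g in genes if g != DONT_CARE]
    return widths, len(widths) * SECTION_LENGTH_UM

def search_space_size(values_per_gene=42, sections=70):
    """Number of distinct genotypes."""
    return values_per_gene ** sections
```

Here `math.log10(search_space_size())` is about 113.63, which corresponds to the $4.24 \cdot 10^{113}$ quoted above, so the 10350 evaluated individuals sample only a vanishing fraction of the space.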

The  optimal  converter  structure  corresponds  to  the 
topology  given  in  Fig. 6  and  the  optical  field 
distribution  is  shown  in  Fig.  8.  The  scattering  loss 
through  the  converter  structure  is  estimated  to  be  less 
than  0.2  dB  and  the  principal  neglect  of  power 
reflection  in  our  simulation  model  has  been  affirmed 
by  measurement  [12],  [13]  of  a  very  low  value  of 
-40  dB. 

The  final  converter  device  has  an  overall  length  of 
138 µm, which to our knowledge represents the
shortest  spot-size  converter  ever  built  for  such  large 
refractive  index  steps. 
5. FREQUENCY CARRIER DISTRIBUTION

Today,  fiber  optic  links  are  substantial  parts  of 
modern  communication  systems  [28].  It  is  therefore 
important to know their distortion and noise properties
[29]. Systems with subcarrier multiplexing
(SCM), in which often equally spaced rf carriers with
different amplitudes lie within a narrow band, have
very low intermodulation-distortion (IM) specifications,
as do community antenna television systems
(CATV). In optical transmission links with standard
fibers and directly intensity-modulated lasers at
1.3 µm, the main contribution to the distortion is due
to mixed - static and dynamic - laser nonlinearity
[30]. In such communication systems only odd orders
of  the  nonlinearity  have  to  be  considered  when  a  weak 
nonlinearity  is  assumed. 

It is rather the resulting 3rd order IM which is of
technical relevance [31]. Having, e.g., a transmission
band of $f_1, \ldots, f_n$ equally spaced rf carrier frequency
channels, where $M_c$ is assumed to be the set of
operational carrier indices, then 3rd order IM
generates mixing products of the following kind:
$f_i+f_k-f_l$, $f_i-f_k+f_l$, $-f_i+f_k+f_l$, $\forall\, i, k, l \in M_c$. All mixing
products which coincide with a frequency $f_r$ within
the transmission band obey $i+k-l = r$ or $i-k+l = r$ or
$-i+k+l = r$, $\forall\, i, k, l \in M_c$.

In order to propose $M_C$ as an optimal carrier
distribution, one has to look for operational rf carrier
frequencies within the transmission band whose IM
products interfere minimally among themselves as well
as with the carriers that generate them.

In an ideal case, where one simply wants to
prevent a carrier from overlapping with the IM products
stemming from the remaining ones, all distances
between pairs of carrier frequencies should be
different, i.e., $i - l \neq r - k$. A set $M_C$ with such
properties is also called a “Golomb ruler” [17] when
containing 0 as an additional element. Therefore,
placing $N$ operational carriers within a minimal
transmission bandwidth of $n > N$ channels means
nothing else than looking for a Golomb ruler whose
largest element is as small as possible.
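
The Golomb property described above (all pairwise distances between marks are different) is easy to check by brute force. The following is a minimal sketch in Python; the function name `is_golomb_ruler` and the example mark sets are illustrative assumptions and are not taken from [17].

```python
from itertools import combinations

def is_golomb_ruler(marks):
    """Return True if all pairwise distances between the marks are distinct."""
    distances = [b - a for a, b in combinations(sorted(marks), 2)]
    return len(distances) == len(set(distances))

# Illustrative mark sets (hypothetical, chosen only for this sketch).
candidate_ok = [0, 1, 4, 9, 11]   # every distance occurs exactly once
candidate_bad = [0, 1, 2, 4]      # the distance 1 occurs twice
```

Such a check only verifies a given set of marks; finding a ruler with the smallest possible largest element remains the hard combinatorial part of the problem.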

Fig. 9: Optimal distribution of 6 different carriers
within 15 equally spaced transmission frequency
channels. (Left) fitness evolution during optimization
(horizontal axis: i-th generation), (right) transmission
band with optimally placed carriers shown as bars
(horizontal axis: channel number).

Computational solutions are only available for
$n \gg 16 > N$. Thus, considering dense carrier
distributions inevitably leads to a combinatorial
optimization procedure, where a minimal
intermodulation-to-carrier ratio (IM/C) should be
achieved for occupied channels as well. The optimization
task becomes even more severe when taking into account
different carrier amplitudes. There are


$$M_1 = \binom{n}{N}$$

combinations of how to distribute $N$ operational
carriers within $n$ transmission channels. Assuming a
given set of $N$ different amplitudes, an additional
$M_2 = N!$ permutations of carrier amplitudes have to
be taken into account within each distribution pattern.
As the genotype of a particular carrier distribution, we
define a bit-string representation for a pair of ordinal
numbers $(m_1, m_2)$ with $m_1 \in [1, M_1]$ and
$m_2 \in [1, M_2]$, where the first addresses the
combination state of the particular pattern and the
second characterizes its permutation state. The fitness
of a particular pattern is then calculated with respect
to the worst IM/C of all occupied transmission channels
involved.
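
To give a feeling for the size of this search space and for how 3rd order products land on channels, here is a small Python sketch. It is not the fitness function of [17]: amplitudes are ignored, only products of the form i + k - l are counted (the other two forms are the same set under relabeling of the indices), and products that merely reproduce one of their own carriers are skipped. The names `im3_hits` and `search_space_size` are hypothetical.

```python
from itertools import combinations_with_replacement
from math import comb, factorial

def search_space_size(n_channels, n_carriers):
    """Number of (combination, permutation) pairs, i.e. M1 * M2."""
    return comb(n_channels, n_carriers) * factorial(n_carriers)

def im3_hits(carriers, n_channels):
    """Count 3rd order products i + k - l falling on each channel 1..n_channels."""
    hits = {r: 0 for r in range(1, n_channels + 1)}
    for i, k in combinations_with_replacement(carriers, 2):
        for l in carriers:
            if l == i or l == k:
                continue  # assumption: skip products that coincide with their own carrier
            r = i + k - l
            if 1 <= r <= n_channels:
                hits[r] += 1
    return hits

size = search_space_size(15, 6)   # the n = 15, N = 6 case discussed in the text
hits = im3_hits([1, 2, 4], 5)     # tiny illustrative carrier set
```

For the tiny carrier set {1, 2, 4} in a 5-channel band, all products fall on the unoccupied channels 3 and 5, so no occupied channel is disturbed. A realistic fitness would additionally weight each product by the carrier amplitudes involved.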

Our exemplary evolutionary optimization problem
[17] includes a set of 6 given carrier amplitudes
to be placed within a transmission band of 15 equally
spaced frequency channels. As optimizer we use a
standard genetic algorithm (generation-based genetic
algorithm: traditional one-point crossover, 60%
selection probability, 1% mutation rate) which
operates on a population of 300 individuals. A
best performing solution was found after 30 of a total
of 60 generations. Fig. 9 shows the optimal carrier
distribution leading to a minimal 3rd order IM
distortion of the fiber optic SCM link.
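
The two variation operators named above can be sketched in a few lines of Python. This is a generic illustration of one-point crossover and bit-flip mutation on bit lists, not the authors' code; the function names, the fixed seed and the 8-bit parents are assumptions made for the example, and the selection step is deliberately left out.

```python
import random

def one_point_crossover(parent_a, parent_b, rng):
    """Swap the tails of two equal-length bit lists at a random cut point."""
    cut = rng.randrange(1, len(parent_a))  # cut lies strictly inside the string
    return parent_a[:cut] + parent_b[cut:], parent_b[:cut] + parent_a[cut:]

def mutate(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in bits]

rng = random.Random(0)              # fixed seed only to make the sketch repeatable
parent_a, parent_b = [0] * 8, [1] * 8
child_1, child_2 = one_point_crossover(parent_a, parent_b, rng)
mutated = mutate(child_1, 0.01, rng)  # 1% mutation rate as quoted in the text
```

With all-zero and all-one parents the cut point is directly visible in the children, which makes this pair convenient for checking an implementation.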

The optimization problem presented here is also
of prime importance for the design of very
advanced optical WDM systems. For high-speed
WDM systems the simultaneous requirements of high
launched power and vanishing fiber dispersion lead to
the generation of new optical frequencies by
four-photon mixing. These generated waves can interfere
with system operation, degrading the system
capacity by intermodulation distortion and additional
noise generation in band-limited erbium-doped fiber
amplifiers (EDFAs). In order to prevent phase
matching of these waves one is tempted to allow a
small amount of fiber dispersion at an additional
expense of system capacity [32]. Hence, an optimization
of the optical carrier distribution enables the
reduction of intermodulation distortion without the
need for any dispersive fiber.

6.  DIELECTRIC  MATERIAL  MODELS 

In this section, we report an evolutionary
optimization based method for the determination of
the dispersive dielectric properties $\varepsilon_r(f)$ of
natural materials exhibiting high dielectric and ohmic
losses over a wide frequency range. Accurate
information on the dependence of the dielectric
properties of (mixtures of) natural materials on the
content of, e.g., water or hydrocarbons, and also on
temperature is of considerable importance in a number of
applications, e.g., in environmental engineering,
geophysics, mathematical geology and chemical process
engineering. The microstructure of such multiphase
mixtures is generalized by a structural material matrix
representing the characteristic distribution of its
constituents. This concept of structural units [33],
which is a picture for capturing the microstructural and
compositional information of the randomly distributed
constituents within a dielectric host material, becomes
particularly attractive when linked to an accurate
spectral dispersion model in an effective medium
approach. Hence, having such an accurate macroscopic
description of dielectric mixtures at one's disposal
could even have a seminal impact on advanced topics in
physical optics such as wave localization phenomena due
to random scattering, photon diffusion, and coherent
backscattering, and it has already led to diffusive wave
spectroscopy as a new optical measurement technique in
material science and food engineering [33].

Fig. 10: (Top) relaxation spectra $g_n(f_{rn})$ and (bottom)
Cole-Cole plot of $\varepsilon_r(f)$ for a volumetric water
content of (left) $\Theta = 0$ and (right) $\Theta = 15\%$,
where the relaxation frequency of free water is clearly
reproduced by the proposed model. The frequency range of
the measured scattering data is $f$ = 10 MHz ... 3 GHz.

The  analytical  material  model  presented  here  is 
extracted  from  electromagnetic  scattering  data  of  a 
corresponding  coaxial  transmission  line  measurement 
setup.  Following  the  classical  Debye  model  for  the 
relative  permittivity  we  propose  a  weighted  linear 
superposition  of  N  different  Debye  models 
where $\varepsilon_s$ stands for the static limit,
$\varepsilon_\infty$ for the high frequency limit,
$\varepsilon_0$ describes the vacuum permittivity,
$\sigma_{diel}$ accounts for the ohmic conductivity of the
material involved, $f_{rn}$ represents the relaxation
frequency of the $n$-th Debye model and $g_n(f_{rn})$
defines a normalized relaxation weighting function which
itself is composed of a finite set of $G$ different
Gaussian relaxation functions. Choosing such a finite
basis for the relaxation spectra $g_n(\cdot)$ mainly helps
to circumvent the ill-posedness of the model estimation
problem. The genotype consists of an appropriate binary
representation of all parameter values to be optimized.
The parameters include the weightings, the relaxation
frequencies and bandwidths of the numerous Gaussian
relaxation components, the conductivity $\sigma_{diel}$ and
both limits, $\varepsilon_s$ and $\varepsilon_\infty$, of the
permittivity model. We define the fitness of a
potential solution as the quality of the agreement
between the calculated and the measured scattering
spectrum. With regard to the matching of the scattering
phase between analytical model and measured data, the
resulting fitness function behaves like a jagged
multimodal landscape, posing serious pitfalls for
commonly used optimization algorithms.
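
As an illustration of the kind of model described here, the following Python sketch evaluates a weighted superposition of classical Debye terms plus an ohmic conduction term. The exact expression used by the authors is not reproduced in this text, so the form below is an assumption: it uses the standard Debye term 1/(1 + j f/f_r), the e^{+jωt} sign convention, weights that sum to one, and a conductivity in S/m. The function name `multi_debye` and all numerical values are hypothetical.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m

def multi_debye(f, eps_s, eps_inf, weights, f_relax, sigma=0.0):
    """Relative permittivity of a weighted sum of Debye terms at frequency f (Hz).

    Assumed form (not taken from the paper):
        eps_inf + (eps_s - eps_inf) * sum_n g_n / (1 + j f / f_rn)
                - j sigma / (2 pi f eps0)
    """
    relaxation = sum(g / (1 + 1j * f / fr) for g, fr in zip(weights, f_relax))
    conduction = -1j * sigma / (2 * math.pi * f * EPS0)
    return eps_inf + (eps_s - eps_inf) * relaxation + conduction

# Single Debye term evaluated exactly at its relaxation frequency.
single = multi_debye(1e9, 12.0, 3.0, [1.0], [1e9])
# Two equally weighted terms, far below and far above both relaxation frequencies.
low = multi_debye(1e-3, 12.0, 3.0, [0.5, 0.5], [1e6, 1e9])
high = multi_debye(1e15, 12.0, 3.0, [0.5, 0.5], [1e6, 1e9])
```

Without conduction, the model tends to the static limit at low frequency and to the high frequency limit at high frequency, which is a quick sanity check for any implementation of such a superposition.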

As an evolutionary optimized example we present
the analytical description of Bentonite, a highly lossy,
very complex clay-like material, with and without
volumetric water content $\Theta$ [18] at a temperature of
23°C. The behavior of our estimated model is shown in
Fig. 10, whereas the corresponding parameters can be
obtained from Tab. 1.


                          $\Theta = 0$     $\Theta = 15\%$
# individual              21373            21552
$\varepsilon_s$ [-]       12.1236          30.4155
$\varepsilon_\infty$ [-]  3.00012          2.18958
$\sigma_{diel}$ [mS]      18.314           99.9847

Tab. 1: Optimized parameter set for Bentonite at two
different humidity states.

To conclude, we derived a very general analytical
material model for complex and highly lossy dielectric
materials which outperforms commonly used Debye
models in terms of both flexibility and accuracy.
Our approach is able to cover different distinct
relaxation phenomena which are not easily tractable
within a straightforward ab initio dispersion formula.


7.  PROBLEM-BASED  ALGORITHMIC 
PROSPECTS 

We have demonstrated the applicability of
evolutionary algorithms to various optimization
problems within the field of computational optics and
electromagnetics. After all, this is because most such
real-world problems can easily be transformed into
combinatorial problems, for which evolutionary
algorithms, and especially genetic algorithms, are
claimed to be among the best suited heuristic
optimization codes. In addition, this kind of
optimization scheme delivers much more general
information about what actually leads to a good
solution. Therefore, it permits us to implement
superior meta-optimization strategies which rely on,
e.g., population-based information gathering. Such an
information gathering procedure includes structural
information concerning typical patterns [8] within
optimized individuals as well as temporal information
[11] on the evolution process itself. In the following,
both types of information gathering will be elucidated
in the context of a corresponding application.

7.1 STRUCTURAL INFORMATION PROCESSING IN THE CONTEXT
OF MULTI-CAVITY LASER DIODE OPTIMIZATIONS

All optimization scenarios presented in Section 2
appear to converge to an optimal laser structure, and it
seems that not even a continuation of the optimization
process up to some higher iteration number enables
the generation of better performing individuals. In
addition, most of the statistically available
information concerning a “final” state of a population’s
evolution (e.g., the decreasing spread of fitness
values) usually fails to reflect the optimizer’s
potential for further improvement.

Therefore  a  structural  analysis  of  all  individuals,  i.e., 
searching  for  frequent  and  successful  patterns  within  this 
optimized  population  could  probably  answer  two 
questions:  First,  is  such  an  information  gathering 
procedure  capable  of  delivering  a  novel  population 
whose  prospects  look  more  promising  within  a  further 
optimization  attempt?  Second,  is  it  also  possible  to 
formally  acquire  insight  as  to  what  actually  leads  to  well 
performing  laser  structures? 

The information gathering based on pattern analysis
[8] is simply done by evaluating the frequency of
appearance of characteristic Q bit-patterns ($Q < L/\delta L$)
within the population. By stepping a Q bit wide window
along each individual’s genotype a corresponding
number of different Q bit-strings can be extracted. All
these strings are then sorted according to their pattern
label, thus assigning each pattern to its frequency of
appearance (Fig. 11b). A similar procedure delivers the




ACES  JOURNAL,  VOL.  15,  NO.  2,  JULY  2000  SI:  GENETIC  ALGORITHMS 


Fig. 11: Pattern analysis considering the final population of the optimization scenario described in Section 2:

a) Distribution of characteristic 18-bit patterns along the laser structure, ranked by their frequency of appearance.

b) Corresponding frequency of appearance of these patterns. For visualization purposes the pattern analysis has been
restricted to the high and low refractive index segments of the decoded cavity structure (left). Typical cavity
refractive index pattern deduced from the 18-bit pattern distribution (top right). The corresponding non-periodic
coupled cavity laser structure (bottom right) consists of 45 sections and has a total length of 700 µm.


most frequent position for every Q bit-pattern within this
ranking, leading to the distribution scheme shown in
Fig. 11a. Finally, the distribution of characteristic Q
bit-patterns enables us to deduce a typical laser structure
which is believed to gather all the specific information
needed to qualify as a good solution. The typical laser
structure of Fig. 11 is obtained by counting each specific
allele value of all pattern sequences at the considered
segment position. The counting procedure itself employs
a weighting which is proportional to the pattern’s
frequency of appearance. Therefore, the most frequent
parts of patterns will always obtain recognition.
Choosing pattern lengths between Q = 3 bit and Q = 90
bit, up to 88 different typical laser structures can be
obtained, which contribute in part to a novel starting
population for a further optimization.
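
The window-stepping part of this pattern analysis can be sketched as follows in Python. This is only an illustration of counting Q-symbol windows and their most frequent position, not the procedure of [8]: genotypes are represented as plain strings, the window is assumed to advance one position at a time, and the names `pattern_frequencies` and `most_frequent_position` as well as the toy population are hypothetical.

```python
from collections import Counter

def pattern_frequencies(population, q):
    """Count every q-symbol window found in any genotype of the population."""
    counts = Counter()
    for genotype in population:
        for start in range(len(genotype) - q + 1):
            counts[genotype[start:start + q]] += 1
    return counts

def most_frequent_position(population, pattern):
    """Return the start index at which `pattern` occurs most often, or None."""
    q = len(pattern)
    starts = Counter(
        start
        for genotype in population
        for start in range(len(genotype) - q + 1)
        if genotype[start:start + q] == pattern
    )
    return starts.most_common(1)[0][0] if starts else None

toy_population = ["110100", "110110", "010100"]  # illustrative genotypes
freq = pattern_frequencies(toy_population, 3)
```

The weighted allele counting that turns these frequencies into a typical structure is not shown, because its details are given in [8] rather than in this text.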

In order to quantify a population’s diversity $D$, a
particular non-binary definition of the Hamming
distance [6] has to be specified. We therefore
investigate the distribution $\delta D(m)$,

$$\delta D(m) = \left\langle \rho\big(b_i(m), b_j(m)\big) \right\rangle_{i \neq j},$$

which measures the average number of appearances of
incongruous alleles at the $m$-th genotype position,
the average being taken over all pairs among the $N$
integer strings of the population, whereas $\rho$ values
the incongruity between strings $b_i$ and $b_j$ at
position $m$. The summation of $\delta D(m)$ over the
total string length immediately yields the diversity
$D$ mentioned above.
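
A position-wise diversity of this kind can be sketched in Python as follows. Two choices below are assumptions, since the normalization and the incongruity measure are not recoverable from this text: the average is taken over all unordered pairs of strings, and the incongruity is a simple 0/1 indicator of differing alleles. The function names and the toy population are hypothetical.

```python
from itertools import combinations

def diversity_profile(population):
    """Average incongruity per genotype position over all pairs of strings."""
    pairs = list(combinations(population, 2))
    length = len(population[0])
    return [
        sum(a[m] != b[m] for a, b in pairs) / len(pairs)  # 0/1 incongruity (assumed)
        for m in range(length)
    ]

def diversity(population):
    """Sum of the position-wise profile over the total string length."""
    return sum(diversity_profile(population))

toy_population = [[0, 1, 2], [0, 1, 3], [0, 2, 3]]  # illustrative integer strings
profile = diversity_profile(toy_population)
total = diversity(toy_population)
```

Positions where the profile is zero are those on which all individuals agree, which corresponds to the fully converged regions of the genotype.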

Within the optimization scenario presented in
Section 2, different population stages have been analyzed
according to the appearance of common patterns. As
an example, the information gathering procedure has
yielded 15 typical laser structures, forming a novel
population, with some individuals performing even
better, and whose diversity is around 13 bit. This
represents a distinct increase compared to the 8 bit of
the underlying population. Further details
of the re-optimization process including such typical
laser structures are elaborated in [8].

Coming back to the typical laser structure shown
in Fig. 11, it can be noted that especially the regions
neighboring the two laser facets are strongly
correlated and imply a certain robustness against
optimization interferences. Thus, changing segments
from inner regions of the cavity has proved to be a more
successful policy while tracking down well
performing laser topologies. This assumption is
clearly confirmed when investigating the distribution
$\delta D(m)$. Inspecting the configuration shown in
Fig. 12, one may be tempted to allocate the shaded
regions to resistant characteristic patterns. But,
because of their different algorithmic backgrounds, the
structure given by the shaded regions in Fig. 12 and the
typical cavity topology of Fig. 11 are not rigorously
comparable to each other. The typical cavity topology is
generated when gathering the common pattern information
within a population whereas the structure given in
Fig. 12 puts the focus on all its differences.

Fig. 12: Diversity distribution $\delta D$ (bold line) mapped
along a corresponding decoded cavity configuration
considering all genotypes of the underlying final
population of the optimization scenario. The shaded
sections indicate locations where the congruence of all
genotypes tends to be exact and the optimizer’s
interference is therefore believed to be negligible,
whereas the gaps stand for positions of distinct
incongruity within the genotypes involved. The
summation of $\delta D$ over the total string length
immediately yields the diversity $D$.

In conclusion, our characteristic pattern analysis
reveals a noteworthy feature: nearly independent of
the state of a population’s convergence, the proposed
information gathering procedure mostly delivers one
individual whose fitness exceeds that of the best
performing structure of the underlying population.
Therefore we suggest that our information gathering be
used as a sort of meta-optimization strategy.
Increasing a population’s diversity without degrading the
corresponding fitness could be regarded as a useful
means to revitalize a population’s prospects when
looking forward to a further optimization attempt [8].


7.2  TEMPORAL  EVOLUTION  ASPECTS  IN 
THE  SPOT-SIZE  CONVERTER  DESIGN 

Our evolutionary optimization scenario presented
in Section 4 also delivers temporal information which
may be reassessed in the framework of a superior
solution strategy. One of the main differences
between classical heuristic optimization procedures
such as, e.g., Monte Carlo or simple hill-climbing
methods and evolutionary optimization procedures is
the implicit parallel search mechanism of the latter. As
is demonstrated later, any successful converter contains
characteristic substructures that significantly
contribute to good performance. In our procedure it is
possible to keep track of such substructures during
evolution. In order to obtain the corresponding data of
the traces, substructures of 10 segments length were
compared using a sort of relaxed structural correlation
scheme: if no more than 3 segments of that substructure
differ from one individual to another, both
individuals are considered to be part of the same
trace. The iteration index within the evolution process
and the fitness of all individuals taking part in a trace
are stored.
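
The relaxed correlation rule stated above (same trace if at most 3 of 10 segments differ) can be written down directly. The Python sketch below is an illustration under assumptions: genotypes are strings with one character per segment, the substructure is compared at a fixed position, and the data layout (a list of generations, each a list of genotype and fitness pairs) as well as the names `same_trace` and `follow_trace` are hypothetical.

```python
def same_trace(sub_a, sub_b, max_diff=3):
    """True if two equal-length substructures differ in at most max_diff segments."""
    if len(sub_a) != len(sub_b):
        raise ValueError("substructures must have the same length")
    return sum(x != y for x, y in zip(sub_a, sub_b)) <= max_diff

def follow_trace(reference, generations, start, length=10, max_diff=3):
    """Collect (iteration index, fitness) of all individuals matching the reference."""
    trace = []
    for iteration, individuals in enumerate(generations):
        for genotype, fitness in individuals:
            window = genotype[start:start + length]
            if same_trace(reference, window, max_diff):
                trace.append((iteration, fitness))
    return trace

# Illustrative data: two generations with two individuals each.
generations = [
    [("AAAAAAAAAABB", 0.1), ("BBBBBBBBBBAA", 0.2)],
    [("AAABBBAAAACC", 0.5), ("AABBBBAAAACC", 0.4)],
]
trace = follow_trace("AAAAAAAAAA", generations, start=0)
```

Storing the iteration index together with the fitness is what later allows a trace to be followed backwards from a fitness jump or from the final population.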

We can think of three different types of traces,
each addressing a question: (1) Traces from the initial
population: Are substructures of the initial population
still persistent in a later evolution stage? (2)
Backward traces from distinct fitness jumps: Which trace
is mainly responsible for the increase in performance,
or which characteristic substructure is part of this best
performing individual? (3) Backward traces from the
final population: How many traces and which
substructures constitute the final population?

Regarding the survivability of the initial
population’s substructures, it is observed within our
specific example [11] that, even when most of the
patterns die out within the first 25% of the
optimization process, there are still two traces that
play a major role during the overall evolution. This
shows that proper initialization, i.e., the initial
population’s quality of diversity, may have a
considerable impact on the evolution’s outcome.
Different initialization schemes (e.g., using
deterministic or heuristic number generators instead of
standard pseudo-random processes) are now under
extensive investigation.

The history of substructures which provoke
distinct fitness jumps reveals the coexistence of
different competing patterns within the evolving
population. Some substructures will temporarily be at
the top of the population’s fitness ranking, while
others are successful at another time [11].

Fig. 13: To observe whether there are still different
sub-populations in the actual or final population, a trace
back to earlier stages of the population’s evolution
may be created. By doing so, it is possible to observe
how the evolution of sub-populations takes place.
The parallelism in the evolution is therefore clearly
visible. For these examples, the backward traces are
shown for a population at 7300 evolution steps.

Fig. 14: Value of the evolution figure during the
optimization. Four phases may be distinguished, where
the labelling is proposed for visual purposes only.

Considering the traces that constitute a final
population (as depicted in Fig. 13), this competition of
patterns turned out to be a useful measure when
qualifying an optimizer’s potential termination
state: each substructure may be interpreted as
part of a sub-population of individuals containing
this unique pattern, which also exemplifies that
parallel optimization of different structures takes
place even in a final evolution state. Having
different sub-populations at such stages underpins
the impact of crossover at the expense of
mutation, indicating that the optimization is still in
an efficient operation mode compared to a purely
statistically driven random search process. Thus,
to quantify the vitality of a population after $n$
iteration steps, a state variable may be defined as
follows [11]

$$C_P(n) = \frac{1}{F(n)} \sum_{i=1}^{N_{SP}(n)} F_i^{SP}(n)$$

where $F(n)$ stands for the temporal maximum
fitness, $N_{SP}$ represents the total number of
sub-populations and $F_i^{SP}(n)$ denotes the maximum
fitness within the $i$-th sub-population. Fig. 14 shows
the evolution of $C_P(n)$, for which a categorization
into four different phases of the evolution
process has been proposed. Here, $C_P(n)$ may be
viewed as a specific representation of the number
of competing patterns within the population
involved.

7.3  EPILOGUE 

We believe that, when provided with both structural
and temporal information on a population’s
evolution, one should be able to define certain
measures [8], [11] concerning, e.g., the vitality of
the population or even a specification of its actual
state of evolution. In order to underpin such
ventured conjectures, extensive statistical
investigations are indispensable, including also a
much broader spectrum of examples than presented
here. However, a lack of generality will always
remain in any attempt to formalize the learning
process of evolutionary algorithms.
Therefore, other promising combinatorial
optimization methods have to be compared when
relying on an evolutionary paradigm. For the
assessment of problem-specific search space
characteristics, hybridization of evolutionary
algorithms with other methods should be
investigated as well.

8. TOWARDS COMPUTER GUIDED
ENGINEERING

Apart  from  the  algorithmic  considerations 
depicted  in  the  previous  Section  7  we  will  now  briefly 
sketch  two  lines,  where  our  research  on  evolutionary 
optimization  in  computational  optics  is  about  to 
advance.  Within  both  strategies  we  always  rely  on  the 
gathering  of  specific  information  regarding,  e.g.,  the 
actual  shape  of  the  structure  involved,  the  simulator’s 
peculiarities  and  even  the  functional  dependencies  on 
the  circuit  level. 

8.1  IMPROVEMENTS  WITHIN  ADVANCED 
DEVICE  OPTIMIZATION  PROBLEMS 

At present we are strongly involved in the design
of complex smart planar optical transducer elements
for (bio-)chemical and physical sensor systems.
Within these activities we expect to obtain
deeper insight into the mechanisms of optical
coupling and into the design of new grating couplers
[35], especially of ultra-compact, highly non-periodic
coupler topologies. A rigorous design of such dense
electromagnetic field coupling configurations usually
represents an inverse scattering problem, which can
only be solved with a combination of highly
sophisticated codes for computational electromagnetics
coupled to, e.g., an evolutionary optimizer.

When one links such optimization procedures
with such simulation tools, one faces several difficult
problems. As its main task the code for computational
electromagnetics solves a so-called forward problem
for the optimization procedure. Even when the time
spent on the forward problem is long, the results have
a limited accuracy. This may cause some noise within
the data, which considerably disturbs the search
process. Thus, the forward problem has to be solved
many times. With these issues in mind, three
requirements should be respected when looking for an
appropriate forward solver: (1) the simulation program
should be as efficient as possible, (2) it should remain
completely robust while possibly treating solutions not
even thought to exist, and (3) it is mandatory that the
solver delivers an error measure in order to guarantee
a certain accuracy of the search process.

The multiple multipole (MMP) method [34] is a
well-established, semi-analytical tool for solving
time-harmonic 2D and 3D scattering problems within
piecewise linear, homogeneous and isotropic
domains. It is based on the generalized multipole
technique (GMT). With MMP, the field $f_D$ within
an individual domain $D$ is approximated by a sum of
$N$ cylindrical or spherical multipole expansion
functions $f_{Dj}$,

$$f_D \approx f_{D0} + \sum_{j=1}^{N} A_{Dj}\, f_{Dj} + \text{Error},$$

which are themselves analytical solutions of the
Helmholtz equation, where $f_{D0}$ stands for the excitation.


Fig. 15: MMP calculation of a single slab waveguide perturbation pattern: (left) Intensity plot of the
time-averaged Poynting field for TE-excitation from the left side, (right) Distribution of the corresponding multipole
expansions (each multipole location is indicated by a small circle, boundaries are drawn as solid lines). The slab
waveguide system consists of a TiO2 core layer (thickness 150 nm), an H2O upper cladding layer and
polycarbonate as lower cladding. The two grooves (1: width 100 nm, depth 20 nm; 2: width 40 nm,
depth 30 nm) are separated by 200 nm. The operating wavelength is 785 nm (vacuum).

The origins of the multipole expansions are usually set
along the boundary of the domains in which the field
is to be calculated. For the field around voluminous
domains Hankel-type expansions are used whilst
Bessel-type expansions are preferred inside. Other
special functions are included as well, e.g.,
propagating and evanescent plane waves. The
coefficients $A_{Dj}$ are obtained by enforcing the
boundary conditions for the field components at
discrete matching points on the boundary. Since more
matching points are introduced than necessary, the
MMP method leads to an overdetermined system of
equations. This system is solved in the least-squares
sense, which is equivalent to an error minimization
technique. Thus, an adequate error measure is
inherently delivered by the method itself.
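
The idea that an overdetermined system solved in the least-squares sense yields its own error measure can be illustrated with a toy example in Python. This is not MMP code: the system is real-valued and tiny, it is solved through the normal equations (acceptable only for small, well-conditioned demonstrations), and the names `least_squares` and `residual_norm` as well as the matrix entries are assumptions made for this sketch.

```python
def least_squares(A, b):
    """Solve min ||A x - b|| via the normal equations (A^T A) x = A^T b."""
    n = len(A[0])
    M = [[sum(row[i] * row[j] for row in A) for j in range(n)] for i in range(n)]
    v = [sum(row[i] * bi for row, bi in zip(A, b)) for i in range(n)]
    # Gaussian elimination with partial pivoting on the n x n system.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        v[col], v[pivot] = v[pivot], v[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n):
                M[r][c] -= factor * M[col][c]
            v[r] -= factor * v[col]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (v[i] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def residual_norm(A, b, x):
    """Euclidean norm of A x - b, i.e. the remaining mismatch."""
    return sum(
        (sum(a * xi for a, xi in zip(row, x)) - bi) ** 2 for row, bi in zip(A, b)
    ) ** 0.5

# Three "matching conditions" for two unknown coefficients (illustrative numbers).
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [1.0, 2.0, 4.0]
x = least_squares(A, b)
err = residual_norm(A, b, x)
```

The residual norm plays the role of the error measure mentioned above: it is zero only if all matching conditions can be met simultaneously.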

In order to maintain robustness during an
optimization scenario, MMP should be insensitive to all
parameter variations involved. Here, the most
challenging task is to successfully adapt the
simulation to repeated changes of the coupler’s
grating shape. For that reason we have developed a
fully automatic pole-setting procedure which allocates
all multipole expansions needed along their
corresponding boundaries. The proper setting takes into
account several properties of the actual shape and
also considers implicit aspects such as, e.g., the
curvature and its context within the boundary’s
course. The MMP calculation shown in Fig. 15 is
fully based on the automatic pole distribution
procedure and concerns a preliminary perturbation
pattern which may constitute a grating coupler within
our typical sensor configuration.

Fig. 16: Non-periodic grating: Polar plot of the
radiated far-field (time-averaged Poynting field) for
TE-excitation from the left side. The inset shows the
sevenfold concatenation of various single perturbations
as described in Fig. 15.

Besides the semi-analytical nature of MMP, there
is further algorithmic potential for improving
the program’s efficiency. The parameter estimation
technique (PET) is a very powerful technique that can
be applied to numerical codes based on dense
matrices as a power booster for the computation of
the response of electromagnetic or optical problems
at, e.g., different frequencies. It is applied to the
multiple multipole (MMP) method in conjunction
with the method of conjugate gradients (CG) for
iteratively and efficiently solving the rectangular
MMP matrix. The general idea of the parameter
estimation technique (PET) is the evolutionary
recycling of knowledge. Since all the expansion
parameters $A_{Dj}^{(k)}$ (and functions $f_{Dj}$) are
usually known from the previous runs $1 \ldots k$ while,
e.g., sweeping the wavelength $\lambda$, recycling of
knowledge means nothing else but a pertinent
extrapolation technique for estimating the parameters
$A_{Dj}^{(k+1)}$ to be computed in the current run
$k+1$. This speedup technique has already
been detailed in earlier contributions to ACES
publications [15], [16].
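
The recycling idea can be illustrated with the simplest possible extrapolation. The Python sketch below predicts each complex coefficient for the next wavelength by a straight line through the two most recent runs; such a prediction would then serve as a starting guess for an iterative solver. This is an assumption made for illustration only, since the actual estimator of [15], [16] is not described in this text; the function name and all numbers are hypothetical.

```python
def extrapolate_parameters(wavelengths, history, next_wavelength):
    """Linearly extrapolate each coefficient from the two most recent runs.

    wavelengths: wavelengths of the previous runs (at least two, distinct)
    history:     one coefficient list per previous run, in the same order
    """
    w_prev, w_last = wavelengths[-2:]
    a_prev, a_last = history[-2:]
    t = (next_wavelength - w_last) / (w_last - w_prev)
    return [c_last + t * (c_last - c_prev) for c_prev, c_last in zip(a_prev, a_last)]

# Illustrative sweep: two finished runs, prediction for a third wavelength (in nm).
wavelengths = [780.0, 785.0]
history = [[1.0 + 1.0j, 2.0 + 0.0j], [1.5 + 1.0j, 1.0 + 0.0j]]
guess = extrapolate_parameters(wavelengths, history, 790.0)
```

The better the starting guess, the fewer iterations a conjugate gradient solver needs, which is where the speedup of such a scheme would come from.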

The most powerful means to economize
computational effort is to focus solely on
characteristic portions of the overall coupler structure.
Hence, we have developed a near-to-far-field
transformation which allows the radiation field of a
waveguide perturbation to be approximated simply by a
single particular multipole expansion. Each partial
perturbation pattern can be analyzed within minutes
and is then at the optimizer’s disposal. Given
a library of such generic far-field
expansions, the radiation field of the overall coupler
topology is immediately calculated by placing the
particular expansions accordingly. Fig. 16 depicts the
far-field of a grating structure consisting of a
sevenfold concatenation of the perturbation analyzed in
Fig. 15. Within the scope of a realistic optimization
scenario, the scaling with the problem’s
complexity may be less severe, inasmuch as a speedup
of around two orders of magnitude has become
achievable. Obtaining the field solution of highly
non-periodic grating structures with the same degree
of simplicity as for periodic ones (where the
grating’s unit cell is treated with periodic boundary
conditions) [36] is uniquely attractive, especially when
targeting irregular topologies. This allows us to face
novel design scenarios, probably leading to
unexpected topological coherence and implying
readjusted representation schemes.

Fig. 17: User interface of the developed design
platform. (Left) Formal description of the waveguide
elements, (right) view of the corresponding planar
integrated optical circuit topology.

8.2 MOVING TOWARDS THE CIRCUIT LEVEL

On the system level, we are facing one of the
most demanding inverse problems: designing an
entire integrated optical circuit based solely on optical
specifications. Building on the experience gained from the
optimization examples presented earlier, our research
is now focused on the development of a design
platform for planar integrated optics devices. This
design environment, whose user interface is shown in
Fig. 17, relies on sophisticated representation schemes
for device geometries based on elementary waveguide
structures (e.g., straight waveguides, bends and
tapers). While performing a semantic analysis, the
program is able to identify the potential functionality
of a combination of such elements, leading to “auto
generated” optical circuits including, e.g., directional
couplers and splitters of different shapes. For a rapid
evaluation of each device topology under optimization,
a fast scattering-matrix approach is primarily
used. Fig. 18 shows the general architecture of our
optimization platform, where the forward solver is
allocated by the hierarchical representation scheme of
the underlying problem.

Fig. 18: General architecture of the developed design
and optimization platform. [Diagram label: Inverse Problem Solver]

As an optimizer we consider a kind of evolutionary
strategy (ES) scheme. In order to formalize
the optimizer’s interference during optimization,
several interference operators have been designed.
Finding appropriate schemes for distorting a
circuit geometry, or for modifying an element’s
functionality accordingly, represents the most
demanding part of our implementation. Besides
translational and rotational distortion of the circuit
while maintaining connectivity, other operators such
as scaling and the introduction of predefined
functional building blocks are under extensive
investigation.

Some simple preliminary test cases, e.g., the
optimization of a multi-stage resonant-coupler
add-drop device, have clearly shown that the optimization
problem posed here has an enormous search
space. Even though assessing a 2D circuit topology with
respect to its inherent functionality has a major
influence on the problem’s complexity, we still rely on
our approach: including semantic information, like the
circuit’s intrinsic interrelations, within an
optimization process seems the only way to keep the
problem tractable. Nevertheless, we believe our
evolutionary design environment [37] to be very flexible
because it does not necessarily require a preliminary
design as a starting configuration and even allows
modifications of the problem representation during the
optimization process itself.

9.  CONCLUSION 

By means of five design examples we have
demonstrated why evolutionary algorithms are highly
qualified to solve “real world” inverse problems
considering various applications in the field of planar
integrated optics, optical communication technology,
and dielectric material modeling as well. The modal
treatment of optical fields by an appropriate
underlying structure is an essential operation
regarding the characteristic functionality of the
resulting device. Therefore, we have presented examples
related to both the spectral shaping of the optical field
(single mode multi-cavity laser diodes and concatenated
Bragg grating filters) and the spatial molding of
the light (spot-size converter).

Leaving  the  field  of  structural  optimization  we 
focused  then  on  two  examples  stemming  both  from  an 
applied  engineering  background. 

First, a purely combinatorial optimization problem
was solved in order to improve the
performance of modern optical communication
systems (e.g., fiber optic SCM-links and high-speed
WDM-systems) by means of a better adapted
frequency (or wavelength) carrier distribution. In the
second example we applied the evolutionary
algorithm’s parameter estimation feature to the
determination of the dispersive properties of highly
lossy, very complex dielectric materials starting from
scattering parameter measurements.

After illustrating the various examples, the focus
of this paper has changed towards a more prospective
view where the evolutionary algorithm’s ability to
gather problem-related information during optimization
is addressed. Here, we propose to benefit from
structural interdependencies within a population of
potential solutions as well as to trace different
temporal evolution aspects in order to establish
corresponding superior meta-optimization strategies.

One obvious area for future research on evolutionary
optimization has already been indicated:
the improvement of the forward solver with respect to
speedup, robustness and accuracy. Moving then to the
circuit level, we tried to use the optimizer as a proper
design tool for planar integrated optics devices. Here,
we have faced one of the most demanding inverse
problems. It seems tractable only when the circuit’s
intrinsic interrelations are included (by a semantic
analysis) within the problem representation and the
optimizer’s interference operators are implemented
accordingly. Hence, extensive investigations are still
mandatory. Nevertheless, we propose evolutionary
algorithms as highly valuable candidates when
evaluating codes for computer guided engineering and
virtual design platforms.

Abstract

An  alternative  approach  to  the  design  of  an  array  antenna  to  be  used  to  generate  plane  waves  in  the  near 
field  is  presented.  The  original  array  was  designed  on  the  basis  of  a  triangular  grid  of  seven  elements 
arranged  in  a  hexagon,  to  minimize  the  number  needed  to  achieve  approximately  uniform  illumination 
of  the  test  zone,  under  the  assumption  of  isotropic  element  radiation  patterns.  In  the  alternative 
approach,  a  genetic  algorithm  was  used  to  discover  more  economical  distributions  of  elements  which 
could  still  generate  acceptable  approximations  to  a  plane  wave  zone.  It  was  found  that  considerable 
simplifications  from  the  ‘common  sense’  approach  were  possible. 

1.  Introduction 

The  desirable  incident  field  distribution  in  a  radiative  susceptibility  test  is  a  plane  wave,  existing  at 
least  over  a  test  zone  large  enough  to  enclose  the  equipment  under  test  (EUT).  A  susceptibility  test  is 
intended  to  seek  out  the  worst-case  response  of  the  EUT,  equivalent  to  finding  the  main  lobe  amplitude 
of  an  antenna,  and  such  a  measurement  is  relatively  tolerant  of  imperfections  in  the  quality  of  the  plane 
wave  zone.  Typical  accuracy  criteria  for  established  electromagnetic  compatibility  tests  of  this  type 
would correspond to a spread in the field amplitude of 3 dB peak-to-peak (often up to 6 dB) and a phase
spread  of  90°  peak-to-peak.  This  is  in  contrast  to  the  situation  for  precision  antenna  measurements, 
where  deep  nulls  and  low  sidelobes  have  to  be  measured  in  close  proximity  to  the  main  lobe:  maximum 
amplitude  uncertainties  of  0.1  dB  and  phase  variations  of  22°  are  then  common  criteria.  The  quality 
criterion  on  the  plane  wave  zone  for  EMC  testing  is  thus  lower  than  that  for  antennas,  but  the  desired 
bandwidth  is  likely  to  be  greater  and  the  pressure  to  constrain  costs  greater. 

Test  facilities  for  antennas  which  create  a  local  plane  wave  region  in  the  near  field  (‘Compact  Ranges’) 
almost  always  use  illuminating  antennas  that  are  variants  on  standard  reflector  antenna  designs.  The 
same  principle  has  been  extended  to  EMC  testing,  with  the  modified  criteria  discussed  above,  but  its 
use  of  space  is  rather  uneconomical  for  many  purposes  [1].  To  overcome  this  deficiency,  the  use  of 
array  antennas  for  illumination  of  the  range  has  been  investigated,  with  some  success  [2].  The  array 
was  designed  on  the  basis  of  a  triangular  grid  of  seven  elements  arranged  in  a  hexagon.  This 
arrangement  was  chosen  intuitively  as,  in  principle,  it  minimizes  the  number  of  elements  needed  to 
achieve  approximately  uniform  illumination  of  the  test  zone,  under  the  assumption  of  isotropic  element 
radiation  patterns.  To  achieve  a  high-quality  plane  wave  zone,  it  is  necessary  to  feed  the  elements  with 
differing  signals  having  non-intuitive  ratios  of  relative  amplitude  and  relative  phase  and  this  greatly 
adds  to  the  cost  and  complexity  of  the  scheme.  These  signal  amplitudes  and  phases  have  to  be  found  by 
optimization  procedures  based  on  a  least-mean-squares  method  [2].  It  is  thus  desirable  that  the  number 
of elements in the array be reduced by a systematic procedure that can still guarantee maintenance of a
plane wave test zone that conforms to chosen criteria representing an acceptable approximation to a
local  plane-wave  zone. 

As  an  experiment  in  application  of  genetic  algorithm  (GA)  methods  to  antenna  design,  such  an 
approach  was  investigated  as  a  way  of  producing  a  thinned  array  design  for  an  EMC-quality  compact 
range  that  would  still  be  capable  of  generating  an  acceptable  approximation  to  a  local  plane  wave  over 
a  specified  test  zone.  The  method  requires  the  running  of  numerical  simulations  of  the  antenna  very 
many  times  over,  and  this  can  become  costly  in  use  of  computer  time.  To  minimize  this  requirement, 
the  array  elements  in  this  experimental  study  were  chosen  to  be  simple  dipoles:  the  behavior  of  an  array 
of  more  directive  elements,  such  as  log-periodic  antennas  will  not  be  significantly  different  in  the 
direction  of  the  test  zone,  since  their  main-lobe  amplitude  is  relatively  invariant  with  angle.  Clearly, 
there  will  be  great  differences  between  the  behavior  of  dipoles  and  directive  antennas  in  other 
directions,  but  these  are  not  of  importance  for  the  present  application. 

2.  Genetic  Algorithm  Implementation 

A  genetic  algorithm  has  the  following  general  form  [3,4]: 

1.  Create  a  population  of  N  random  individuals  (chromosomes). 

2.  Assess  the  performance  of  each  individual. 

3.  Rank  individuals  with  respect  to  performance  and  assign  a  fitness  value  dependent  on  ranking. 

4.  Select  M  individuals  (parents)  from  the  population  for  breeding,  the  probability  of  being  chosen 
being  proportional  to  fitness. 

5.  Randomly  pair  parents  and  crossover  parts  of  each  chromosome  (genes)  to  form  N  offspring. 

6.  Randomly  mutate  genes  in  the  offspring  chromosomes. 

7.  Assess  the  performance  of  each  new  individual  in  the  population  of  offspring. 

8.  Record  best  individual. 

9.  Repeat  from  step  3  for  required  number  of  generations. 

For  applications  in  electromagnetics,  steps  2  and  7  can  represent  vastly  larger  computational  tasks  than 
all of the rest put together. In the present work, the industry-standard program NEC-2 [5] was used for
these  steps. 

2.1  Population  Representation  and  Initialization 

Genetic  algorithms  operate  on  a  number  of  potential  solutions  called  a  population.  The  population  is 
composed  of  a  number  of  individuals  (chromosomes),  which  contain  an  encoded  description  of  the 
parameters  (equivalent  to  ‘phenotypes’  in  biological  terminology)  to  be  optimized.  The  most 
commonly  used  method  of  encoding  phenotypes  is  as  binary  strings  [3],  which  are  concatenated  to 
form  a  chromosome. 

After  devising  a  suitable  encoding  scheme,  an  initial  population  of  chromosomes  (typically  around 
100) is randomly generated.

2.2 The Objective and Fitness Functions

The  chosen  objective  function,  O(x),  is  used  to  provide  a  measure  of  how  individuals  have  performed 
with  respect  to  the  problem  space.  The  individual  with  the  best  value  of  O(x)  is  assigned  a  rank 
position  of  N  and  the  worst  O(x)  is  assigned  a  rank  position  of  1.  Another  function,  called  a  fitness 
function  F(x),  is  then  used  to  transform  O(x)  into  a  measure  of  relative  fitness.  The  fitness  value  is 
assigned according to the rank position, $p_x$, of individual x. The fitness function is then derived from the
rank position by application of a bias or selective pressure parameter, B, towards the most fit
individuals. In the present case the following simple linear function was adopted:

$$F(x) = \frac{B\,(p_x - 1)}{N - 1} \tag{1}$$


Hence,  best-fit  individuals  will  have  a  fitness  function  equal  to  B  and  worst  fit  individuals  will  have  a 
fitness  function  of  zero. 
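
The rank-based assignment of Equation (1) can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name, the `minimize` flag and the tie-breaking by list position are assumptions, and the sample objective values are invented.

```python
def rank_fitness(objective_values, bias=2.0, minimize=True):
    """Return F(x) = B * (p_x - 1) / (N - 1) for each individual.

    The best individual gets rank N (fitness B) and the worst gets rank 1
    (fitness 0). Ties are broken by list position, which is an assumption.
    """
    n = len(objective_values)
    if n < 2:
        raise ValueError("need at least two individuals")
    # Sort indices from worst to best so that rank 1 is the worst individual.
    order = sorted(range(n), key=lambda i: objective_values[i], reverse=minimize)
    fitness = [0.0] * n
    for rank, index in enumerate(order, start=1):
        fitness[index] = bias * (rank - 1) / (n - 1)
    return fitness


errors = [0.30, 0.05, 0.20, 0.10]   # hypothetical objective values, lower is better
fitness = rank_fitness(errors, bias=2.0)
```

Because only the ranking enters the fitness, the scale of O(x) has no effect on the selective pressure, which is governed by B alone.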

2.3  Selection 

Selection  is  the  process  of  determining  the  number  of  times  a  particular  individual  is  chosen  for 
reproduction  and,  thus,  the  number  of  offspring  that  it  will  produce.  The  simplest  selection  method  uses 
the  fitness  function  values  to  reject  a  percentage  of  the  population  that  performs  badly  [4].  A  better 
selection  technique  [6]  employs  a  roulette  wheel  selection  (RWS)  mechanism  to  select  individuals 
probabilistically.  In  roulette  wheel  selection  each  individual  in  the  population  has  a  roulette  wheel  slot, 
sized in proportion to its fitness. In mathematical terms this may be expressed as shown in Equation
(2):

$$\mathrm{Prob}(x\ \text{selected}) = \frac{F(x)}{\sum_{i=1}^{N} F(x_i)} \tag{2}$$

A  real-valued  interval  is  determined  as  a  sum  (S)  of  the  fitness  values  over  all  the  chromosomes  in  the 
current  population  and  individuals  are  then  expressed  as  a  proportion  of  this  sum.  To  select  an 
individual,  a  random  number  is  generated  in  the  range  from  zero  to  S  and  the  individual  whose  segment 
spans  the  random  number  is  the  individual  to  be  selected.  This  process  is  then  repeated  until  the  desired 
number  of  individuals  has  been  selected. 
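
The selection procedure just described can be sketched in a few lines. The function name and the sample fitness values are assumptions for illustration; the seeded generator is used only to make the example repeatable.

```python
import random


def roulette_wheel_select(fitness, count, rng=random):
    """Pick `count` indices with probability proportional to fitness."""
    total = sum(fitness)                 # the sum S over the population
    if total <= 0:
        raise ValueError("total fitness must be positive")
    chosen = []
    for _ in range(count):
        spin = rng.uniform(0.0, total)   # random number between zero and S
        running = 0.0
        pick = len(fitness) - 1          # fallback guards against round-off
        for index, value in enumerate(fitness):
            running += value
            if spin < running:           # this segment spans the random number
                pick = index
                break
        chosen.append(pick)
    return chosen


rng = random.Random(0)                   # fixed seed, for repeatability only
parents = roulette_wheel_select([0.0, 2.0, 2 / 3, 4 / 3], count=1000, rng=rng)
```

An individual with zero fitness owns a slot of zero width and is therefore never drawn, so with the linear ranking described earlier the worst individual does not breed.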

2.4  Mating  or  Crossover 

The  basic  operator  for  producing  new  chromosomes  in  genetic  algorithms  is  that  of  crossover.  Like  its 
counterpart  in  nature,  crossover  produces  new  individuals  that  have  some  parts  of  both  parents’  genetic 
material.  Several  crossover  strategies  exist,  each  with  their  associated  merits.  The  simplest  form  of 
crossover,  and  the  one  employed  here,  is  that  of  single  point  crossover  [6].  The  chromosomes  selected 
are  randomly  shuffled  and  then  paired  for  breeding.  A  crossover  point  is  randomly  selected,  dividing 
each  parent  chromosome  into  two  gene  strings  which  are  then  swapped  to  generate  two  new 
chromosomes  (offspring). 
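
Single point crossover on binary strings can be sketched as below. The function name and the all-zero and all-one parents are assumptions chosen so that the exchanged segments are easy to see.

```python
import random


def single_point_crossover(parent_a, parent_b, rng=random):
    """Swap the tails of two equal-length bit strings at a random point."""
    if len(parent_a) != len(parent_b) or len(parent_a) < 2:
        raise ValueError("parents must have equal length of at least 2")
    point = rng.randint(1, len(parent_a) - 1)   # both pieces are non-empty
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b


rng = random.Random(1)                          # fixed seed, for repeatability only
child_a, child_b = single_point_crossover("000000", "111111", rng)
```

Each child keeps the head of one parent and the tail of the other, so no genetic material is created or lost by the operator itself.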
To maintain the size of the original population, the new individuals, created by crossover of the
selected individuals, must be reinserted into the old population. This was achieved by creating
sufficient new individuals to replace the least-fit half of the old population. The most-fit half thus
survives, and its children attempt to evolve to a superior form. Once a new population has been
produced, its fitness may be determined.

2.5  Mutation 

In  natural  evolution,  mutation  is  a  random  process  where  a  gene  is  altered  to  produce  a  new  genetic 
structure.  In  genetic  algorithms,  mutation  is  randomly  applied  (with  a  low  probability,  typically  in  the 
range  0.001  to  0.01)  to  modify  elements  in  the  chromosomes.  The  role  of  mutation  is  to  enable  the 
recovery of good genetic material that may have been lost through the action of selection and crossover
[3]. Many variations on the mutation operator have been proposed, for example, biasing the
mutation  towards  individuals  with  lower  fitness  values  to  increase  the  exploration  in  the  search  without 
losing  information  from  the  fitter  individuals  [7],  or  parameterizing  the  mutation  such  that  the  mutation 
rate  decreases  with  the  population  convergence  [8]. 
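
A minimal sketch of the basic operator, in which every bit is flipped independently with a small probability, is given below. The function name and default rate are assumptions, and the biased and adaptive variants of [7] and [8] are not shown.

```python
import random


def mutate(chromosome, rate=0.01, rng=random):
    """Flip each bit of a binary string independently with probability `rate`."""
    flipped = []
    for bit in chromosome:
        if rng.random() < rate:
            flipped.append("1" if bit == "0" else "0")
        else:
            flipped.append(bit)
    return "".join(flipped)


unchanged = mutate("0101", rate=0.0)   # no bit can flip
inverted = mutate("0101", rate=1.0)    # every bit flips
```

With a rate in the quoted range of 0.001 to 0.01, most chromosomes of a few tens of bits pass through a generation unaltered.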

2.6  Termination 

Because  the  genetic  algorithm  is  a  stochastic  search  method,  it  is  difficult  to  specify  convergence 
criteria.  As  the  fitness  of  a  population  may  remain  static  for  a  number  of  generations  before  a  superior 
individual  is  found,  the  application  of  conventional  termination  criteria  becomes  problematic.  A 
common  practice  [4]  is  to  terminate  the  GA  after  a  pre-specified  number  of  generations  and  then  test 
the  quality  of  the  best  members  of  the  population  against  the  problem  definition.  If  no  acceptable 
solutions  are  found,  the  GA  may  be  restarted  or  a  fresh  search  initiated. 

3.  Optimization  of  the  Geometry  of  an  Array  of  Five  Wire  Dipoles 

A  computer  program  was  developed  which  incorporated  the  major  features  of  a  GA,  as  outlined  above. 
In  addition,  the  software  was  developed  to  automatically  generate  input  files  in  NEC  format  and  then 
run  NEC-2  [5]  from  within  the  programming  environment.  For  computational  speed,  an  array  of  five 
half-wavelength  wire  dipole  antennas  was  initially  chosen  to  demonstrate  the  use  of  a  GA  for 
minimizing  the  normalized  error  in  plane  wave  synthesis. 

The frequency was fixed at 1 GHz and a test zone defined as a cube of side length 0.6 m (2λ) with the
front face positioned 0.4 m from the array. Element locations were constrained to the nodes in a
two-dimensional grid with 8 × 8 allowed locations and a spacing of 0.5λ (to avoid overlapping elements).
The number of combinations in which it is possible to arrange five elements in the 64 locations,
excluding any superpositions of elements and eliminating all patterns that are identical apart from a
spatial transformation, is approximately 7.6 × 10^6 and hence use of an exhaustive search technique for
finding an optimum arrangement was infeasible.
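
As a quick check of the order of magnitude, the quoted figure agrees with the number of ways of choosing five distinct nodes out of 64. The snippet below only evaluates that binomial coefficient; how the count in the text treats spatially equivalent patterns is not reconstructed here.

```python
import math

grid_locations = 8 * 8     # allowed element positions
elements = 5               # dipoles to be placed, no two on the same node
arrangements = math.comb(grid_locations, elements)
print(arrangements)        # 7624512
```

Since every candidate requires its own field simulation, an exhaustive search would need several million solver runs, whereas the GA settings described below evaluate on the order of 100 individuals in each of 100 generations.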

For  this  problem,  the  parameters  to  be  optimized  were  the  locations  of  each  of  the  five  array  elements. 
A suitable chromosome structure therefore consisted of ten phenotypes as shown below:

$$\text{chromosome} = [(x_1, y_1)\ (x_2, y_2)\ (x_3, y_3)\ (x_4, y_4)\ (x_5, y_5)] \tag{3}$$

where $x_n$, $y_n$ are the two-dimensional co-ordinates of the nth array element.

1.  Setting  the  number  of  bits  (genes)  per  phenotype  to  be  3  led  to  a  problem  space  equivalent  to  an  8  x 
8  grid  and  a  total  chromosome  length  of  30  bits. 

2. Setting the restriction that the grid spacing was to be 0.5λ led to a problem space of dimensions 3.5λ
x 3.5λ. A phenotype of value 000 was made to correspond to a value of -0.45 m and a phenotype of
value 111 made to correspond to 0.6 m. The asymmetry is a function of the 3-bit resolution and the
fact that it was considered desirable that one element had the potential to be located at the problem
space origin.

3.  The  performance  of  each  individual  was  determined  by  first  calculating  the  excitation  weightings  of 
individual  array  elements  using  the  synthesis  methods  described  in  [2]  and  then  computing  the 
normalized  synthesis  error  (see  below).  This  was  adopted  as  the  Objective  Function  and  the  results 
ranked  from  ‘best’  (lowest)  to  ‘worst’  (highest). 

4.  Selection  of  the  most  fit  individuals  (those  having  the  lowest  numerical  value  of  the  normalized 
synthesis  error)  was  made  using  the  roulette  wheel  method  and  using  a  selective  pressure  of  B  =  2 
for  defining  the  fitness  function. 

5.  The  mutation  method  used  was  to  change  the  value  of  a  randomly  selected  gene  from  a  randomly 
selected  chromosome  at  each  generation. 

6.  The  number  of  chromosomes  per  population  was  chosen  to  be  100  and  the  algorithm  was  terminated 
after  100  generations. 
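
The encoding of items 1 and 2 can be illustrated by a small decoder that maps a 30-bit chromosome to five element positions. Plain binary coding of each 3-bit phenotype, the constant names and the sample bit string are assumptions; the text does not state whether a Gray code was used or how two elements landing on the same node were handled.

```python
BITS_PER_PHENOTYPE = 3
GRID_SPACING_M = 0.15      # 0.5 wavelength at 1 GHz (wavelength 0.3 m)
GRID_ORIGIN_M = -0.45      # coordinate encoded by 000


def decode_chromosome(bits):
    """Turn a 30-bit string into five (x, y) element positions in metres."""
    if len(bits) != 10 * BITS_PER_PHENOTYPE or set(bits) - {"0", "1"}:
        raise ValueError("expected a 30-character string of 0s and 1s")
    coords = []
    for start in range(0, len(bits), BITS_PER_PHENOTYPE):
        index = int(bits[start:start + BITS_PER_PHENOTYPE], 2)
        coords.append(round(GRID_ORIGIN_M + index * GRID_SPACING_M, 2))
    # Pair consecutive values as (x1, y1), (x2, y2), ...
    return list(zip(coords[0::2], coords[1::2]))


positions = decode_chromosome("000111" + "011011" + "000000" + "111111" + "100010")
```

The phenotype 011 decodes to zero, which shows how the asymmetric range from -0.45 m to 0.6 m places one grid node exactly at the origin.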

The  near  field  synthesis  procedure  [2]  involves  the  specification  of  a  three-dimensional  mesh  of  M 
points  within  the  test  zone.  A  set  of  excitations  for  the  elements  of  the  illuminating  array,  [f],  is  then 
derived  by  minimizing  the  deviations  between  the  resulting  electric  field  values  at  the  nodes  of  the 
mesh  and  the  values  that  would  be  present  if  the  field  distribution  was  a  perfect  plane  wave.  The 
process  may  be  represented  by  the  matrix  equation: 

$$[T][f] = [E] \approx [E_0] \tag{4}$$

where [f] is an n-element vector of complex excitations for the n elements of the array, [E] is an
M-element vector of the resulting electric field values at the nodes of the grid in the test zone, [E0] is a
similar vector for the desired plane wave and [T] is the interaction matrix, of size M x n (one row per
mesh point and one column per array element). The elements of [T] can be found by using an
electromagnetic field computation program, such as NEC. The
synthesis algorithm finds values for the elements of [f] that minimize the deviation between [E] and
[E0].
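
For the case where both magnitude and phase are specified, minimizing the deviation in the least-squares sense leads to the normal equations. The sketch below solves them in pure Python for a tiny invented system, taking [T] to have one row per mesh point and one column per element. It is a generic least-squares solver written for illustration; the synthesis procedure of [2] may differ in detail, and the magnitude-only variant is not covered.

```python
def solve_least_squares(T, e0):
    """Minimize ||T f - e0|| via the normal equations (T^H T) f = T^H e0.

    T is a list of M rows with n complex entries each; e0 has M entries.
    """
    m, n = len(T), len(T[0])
    # Build the n x n system A f = b with A = T^H T and b = T^H e0.
    A = [[sum(T[k][i].conjugate() * T[k][j] for k in range(m)) for j in range(n)]
         for i in range(n)]
    b = [sum(T[k][i].conjugate() * e0[k] for k in range(m)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-14:
            raise ValueError("interaction matrix is rank deficient")
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = A[row][col] / A[col][col]
            for j in range(col, n):
                A[row][j] -= factor * A[col][j]
            b[row] -= factor * b[col]
    # Back substitution.
    f = [0j] * n
    for row in range(n - 1, -1, -1):
        tail = sum(A[row][j] * f[j] for j in range(row + 1, n))
        f[row] = (b[row] - tail) / A[row][row]
    return f


# Invented 3-point, 2-element system whose exact solution is f = [1, 0.5j].
T = [[1j, 0j], [0j, 2 + 0j], [1j, 2 + 0j]]
e0 = [1j, 1j, 2j]
f = solve_least_squares(T, e0)
```

In the application described here each column of [T] would come from one field simulation with a single element excited, so the cost of the solve itself is negligible next to the simulations.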

The normalized synthesis error is a measure of the quality of the fit of the synthesized field to the
desired distribution. It is the normalized summation of the field deviations at all points in the
discretisation mesh used by the synthesis algorithm within the test zone:

$$\text{Synthesis error} = \frac{\sum_{m=1}^{M} \left| E_m - E_{0m} \right|}{\sum_{m=1}^{M} \left| E_{0m} \right|} \tag{5}$$

where $E_m$ and $E_{0m}$ are arbitrary elements of the vectors [E] and [E0] respectively.
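
Equation (5) translates directly into code. The sketch below assumes that the deviations enter as unsquared magnitudes, which is how the extracted equation reads; the function name and the field samples are invented for illustration.

```python
def synthesis_error(field, target):
    """Normalized sum of |E_m - E_0m| over the mesh, as in Equation (5)."""
    if len(field) != len(target) or not target:
        raise ValueError("field and target must be non-empty and equal length")
    deviation = sum(abs(e - e0) for e, e0 in zip(field, target))
    reference = sum(abs(e0) for e0 in target)
    return deviation / reference


target = [1 + 0j, 1 + 0j, 1 + 0j, 1 + 0j]          # ideal unit plane wave samples
field = [1.1 + 0j, 0.9 + 0j, 1 + 0.1j, 1 - 0.1j]   # hypothetical synthesized field
error = synthesis_error(field, target)
```

Because the values are complex, an error in phase alone contributes to the sum just as an error in amplitude does.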

3.1  Results  Using  Synthesis  Method  with  Magnitude  and  Phase  Specified 

Using  the  synthesis  method  with  magnitude  and  phase  specified  [2],  a  genetic  algorithm,  as  described 
in  previous  sections,  was  initiated.  Figure  1  shows  how  the  synthesis  error  of  the  best-fit  individual 
varied  with  generation.  This  figure  highlights  the  difficulty  in  specifying  convergence  criteria  since  the 
synthesis  error  remains  static  for  an  unpredictable  number  of  generations.  The  optimized  element 
locations  are  shown  in  Figure  2  and  the  computed  element  excitations  are  listed  in  Table  1.  The 
optimized  geometry  is  two-dimensional  and  symmetrical  about  the  origin  with  each  element  spaced  at  a 
distance  of  one  wavelength  from  each  other  element.  The  resultant  geometry  is  perhaps  intuitively 
obvious;  however,  this  may  not  necessarily  always  be  the  case  for  larger  arrays  or  for  different  array 
patterns. 

A  sample  of  the  computed  x-component  of  the  electric  field  in  slices  throughout  the  quiet  zone  is 
shown  in  Figure  3  and  the  resultant  synthesis  error  and  the  worst  case  deviation  in  the  field  magnitude 
and  phase  throughout  the  entire  test  volume  are  also  summarized  in  Table  1.  The  deviations  are 
calculated  with  respect  to  an  ideal  plane  wave. 

Table 1. Summary of Element Excitations, Synthesis Error and Maximum Field Deviation for a
Genetically Optimized Array using Magnitude and Phase Synthesis

Element Number    Magnitude (dB)    Phase
1                  0.00             (illegible)
2                 -4.90             -41.8°
3                 -4.90             (illegible)
4                 -3.95             (illegible)
5                 -3.95             (illegible)

Synthesis error        0.1530
Magnitude Deviation    ±4.9 dB
Phase Deviation        ±53°

3.2  Results  using  Synthesis  Method  with  Magnitude  Only  Specified 

From  previous  studies  [2]  it  was  determined  that  a  synthesis  technique  with  magnitude  only  specified 
offered  the  best  method  for  minimizing  the  synthesis  error.  The  GA  was  thus  used  to  determine  if  a 
more  optimal  geometry  could  be  achieved  using  this  procedure.  As  an  aid  to  assessment  of  the 
performance of the genetic algorithm to optimize the plane wave quality, a benchmark problem was
proposed. The cross geometry shown in Figure 2 was considered suitable for comparison purposes and
optimum  element  excitations  determined,  using  the  magnitude-only  synthesis  method,  for  the  test  zone 
specified. 

Figure  4  shows  how  the  synthesis  error  of  the  best-fit  individual  varied  with  generation.  The  synthesis 
error  for  the  benchmark  case  is  included  for  comparison  purposes.  It  is  clear  that  the  genetic  algorithm 
has  been  successful  in  reducing  this  error.  The  optimized  element  locations  are  shown  in  Figure  5  and 
the  computed  element  excitations  are  listed  in  Table  2. 

Table 2. Summary of Element Excitations and Synthesis Error for Benchmark and Genetically
Optimized Arrays using Magnitude-Only Synthesis

                       Benchmark Array (Cross)      Genetic Array
Element Number         Mag (dB)    Phase            Mag (dB)    Phase
1                       0.00       (illegible)       0.00       (illegible)
2                      -9.34       -44.6°            0.00       (illegible)
3                      -9.34       -44.6°           -1.84       (illegible)
4                      -8.87        13.6°           -1.84       -2.9°
5                      -8.87        13.6°           -8.96       47.3°

Synthesis Error         0.0442                       0.0239
Magnitude Deviation     ±4.24 dB                     ±3.08 dB
Phase Deviation         ±61°                         ±70°

A  sample  of  the  computed  x-component  of  the  electric  field  in  slices  throughout  the  quiet  zone  is 
shown  in  Figure  6  for  the  benchmark  case  and  in  Figure  7  for  the  best-fit  genetically  optimized  array. 
The  resulting  synthesis  error  and  the  worst  case  variation  in  the  field  magnitude  and  phase  for  the  two 
cases  are  summarized  in  Table  2.  Comparing  the  results  for  the  cross  geometry  with  those  obtained  in 
Section  3.1,  where  the  excitations  had  been  optimized  using  the  magnitude  and  phase  synthesis 
method,  shows  that  an  improvement  in  the  normalized  synthesis  error  and  magnitude  deviation  is 
achieved  by  using  the  magnitude-only  method.  However,  the  phase  performance  is  shown  to  degrade 
somewhat. 

Comparing  the  results  for  the  cross  array  with  those  for  the  magnitude-only  genetically-optimized 
design  shows  that  there  is  an  improvement  in  the  field  magnitude  error  at  the  expense,  however,  of  the 
phase  uniformity.  This  is  not  unexpected  since  the  optimization  method,  in  this  case,  did  not  take  phase 
into  account  when  computing  the  synthesis  error. 

4.  Conclusions 

Genetic  algorithms  were  shown  to  be  able  to  derive  simplified  designs  for  an  illuminating  array 
antenna  of  a  plane-wave  generator  for  electromagnetic  susceptibility  testing.  Traditional  designs  had 
used  seven  elements,  whereas  genetic  optimization  showed  that  adequate  performance  could,  in 
principle,  be  achieved  with  five.  The  study  was  undertaken  as  a  proof-of-concept  exercise  using  plain 
dipoles  as  the  array  elements,  whereas  a  practical  array  would  use  log-periodic  elements.  Use  of 
dipoles would cause difficulties in practice due to generation of stray radiation away from the test zone,
but within the test zone itself the behavior of dipole and log-periodic elements would be broadly
similar.  The  two  genetically-derived  designs  studied  both  reached  the  optimum  configuration  in  less 
than  60  generations. 

The  design  that  was  derived  by  genetic  optimization  with  magnitude  and  phase  specified  was  of  a 
cross-shaped  configuration  that  was  similar  to  a  thinned  version  of  the  traditional  hexagonal  seven- 
element  design,  but  inherently  more  economical  due  to  the  use  of  only  five  elements.  The  configuration 
optimized  under  a  magnitude  constraint  only  was  closer  in  form  to  a  linear  array,  with  the  result  that 
phase  errors  in  the  test  zone  reached  70°,  although  the  amplitude  distribution  was  relatively  constant, 
showing  a  lower  maximum  deviation  than  could  be  achieved  with  the  cross  geometry.  The  excitation 
pattern  for  the  near-linear  array  might  be  seen  to  have  advantages  of  simplicity  in  some  realizations,  in 
cases  where  the  phase  error  can  be  tolerated.  However,  the  cross-shaped  geometry  is  likely  to  be  more 
generally useful.

Fig. 2. Genetically optimized element locations using magnitude and phase synthesis method.

Fig. 3. Magnitude and phase variations of dominant (x) component of computed electric field
strength due to array in Fig. 2, synthesized by magnitude and phase method.

Fig. 4. Variation of best synthesis error with generation: magnitude-only synthesis method.

Fig. 5. Genetically optimized element locations using magnitude-only synthesis method.

Fig. 6. Magnitude and phase variations of dominant (x) component of computed electric field
strength due to cross-shaped array in Fig. 2, synthesized by magnitude-only method.

Abstract

The  Genetic  Algorithm  (GA)  is  a  very  robust,  powerful  technique  that  is  capable  of  optimizing  designs  in  very  multimodal 
search  spaces.  However,  it  also  requires  significant  numbers  of  simulations  to  perform  such  optimizations.  If  the  simulations 
are  expensive,  as  in  the  case  of  antenna  design,  GAs  can  be  prohibitively  expensive  to  use.  A  clustering  technique  has  been 
investigated which cuts the required number of function calls by 20-90% with minor or no degradation in the optimization
quality.  In  this  technique,  a  GA  using  real-valued  genes  is  halted  when  the  population  has  clustered  around  portions  of  the 
search  space,  and  a  local  optimization  technique  completes  the  optimization  quickly.  This  method  has  been  applied  to  a 
variety  of  test  functions  and  wire  antenna  designs,  and  the  advantages  of  this  technique  seem  to  have  broad  applicability. 

1.0  Introduction 

Communication,  radar  and  remote  sensing  systems  employ  thousands  of  different  types  of  wire  antennas,  and  there  is  an 
increasing  need  for  high-performance,  customized  antennas.  However,  antenna  design  is  a  difficult  field  of  engineering. 
Antenna  designs  have  non-intuitive,  complicated  search  spaces,  and  problems  with  even  a  few  variables  are  highly 
multimodal.  In  addition,  most  antenna  simulations  require  a  significant  amount  of  time  to  run.  Typical  simulations  can  take 
anywhere  from  a  few  seconds  to  several  hours,  so  it  is  imperative  to  use  an  efficient  yet  robust  method  of  optimization. 

Genetic  algorithms  (GAs)  [1,  2]  are  currently  being  explored  with  great  success  as  a  way  to  automate  the  antenna  design 
process  [3].  GAs  are  well  suited  to  the  multimodal,  spiky  search  spaces  of  electromagnetic  problems.  Particularly  useful  is 
that the GA does not require an initial guess, and the amount of design information the engineer must supply can be
minimal.

In  spite  of  their  success,  GAs  with  conventional  convergence  criteria  require  too  many  cost  function  evaluations  for  many 
antenna  design  problems.  This  research  investigates  using  the  clustering  behavior  of  real-valued  genes  during  a  GA 
optimization  as  a  way  to  determine  convergence — a  method  that  significantly  enhances  efficiency. 

A  GA  begins  with  a  random  distribution  of  points  across  a  search  space.  As  the  GA  run  progresses,  order  begins  to  appear  in 
the  population.  For  many  optimization  problems,  the  initial  random  distribution  begins  to  cluster  around  certain  points  in  the 
search  space,  and  gene  values  begin  to  show  organization,  first  in  multi-modal,  then  unimodal,  distributions  as  the  GA 
converges.  Once  gene  value  distributions  become  clustered  around  points  in  the  search  space,  the  GA  has  probably  found  a 
number  of  hills  which,  barring  unusually  useful  mutations,  the  GA  will  slowly  begin  to  exploit.  Members  of  the  population 
that  are  fit  enough  to  survive  will  generally  be  from  one  of  these  peaks.  Peaks  with  individuals  of  greater  fitness  will  gain 
more population members, and eventually the entire population will exist on a single peak and then a single point.

The  GA  can,  however,  be  stopped  as  soon  as  the  population  has  divided  itself  into  a  number  of  discrete  clusters.  A  local 
optimizer  can  then  be  applied  to  each  cluster.  Because  this  clustering  can  occur  early  in  the  GA  run,  many  cost  function 
evaluations  can  be  saved,  usually  with  minor  or  no  impact  on  the  optimization  results. 

1.1  Real  GAs  and  Adewuya’s  Method 

The  reader  is  probably  familiar  with  binary  GAs,  in  which  all  parameters  are  encoded  into  a  string  of  bits  called  a 
chromosome.  Any  continuous  parameters  must  be  discretized,  which  means  that  resolution  becomes  a  factor.  The  crossover 
processes  for  these  GAs  are  also  straightforward,  involving  swapping  bits  in  some  fashion  to  create  children.  However, 
previous  research  [4]  has  shown  that  real-valued  GAs,  where  each  gene  in  a  chromosome  is  a  real  number,  coupled  with 
special  crossover  techniques,  are  much  better  at  optimizing  problems  with  all  or  nearly  all  parameters  continuous. 

These  special  crossover  techniques  involve  the  use  of  interpolation  and  extrapolation  to  create  children.  The  method  used  by 
the  authors  was  first  investigated  in  [5],  and  is  called  Adewuya’s  method.  Adewuya’s  method  consists  of  a  sequence  of 
crossover  methods  applied  to  real  genes.  First,  quadratic  crossover  is  applied,  where  the  child’s  gene  is  taken  from  a 
predicted  minimum  of  a  quadratic  curve  fit  using  three  parents.  If  quadratic  crossover  fails,  heuristic  crossover  is  applied, 
which pulls the child’s gene from a range predicted to be better than two parents’ genes. See Figure 1 for a graphical
representation  of  what  happens  in  these  two  methods. 

Figure 1. Quadratic and heuristic crossover. Fitness is to be minimized in these examples.


If both quadratic and heuristic crossover fail, the child’s gene is one of the parents’ genes taken at random. This process is
applied  gene  by  gene  to  create  a  new  child.  See  [4,  5,  6]  for  a  more  complete  explanation  and  comparisons  with  other 
methods.  This  method  has  been  found  to  be  particularly  powerful  in  electromagnetics  and  mechanical  engineering. 
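
The gene-by-gene sequence described above (quadratic crossover, then heuristic crossover, then a random parent gene) can be sketched roughly as follows. This is only an illustrative sketch, not the implementation from [5]: the function names, the definition of "failure" (no parabola minimum, or a child outside the normalized gene range 0 to 1), and the use of the two fittest of three parents for the heuristic step are all assumptions.

```python
import random

def quadratic_gene(xs, fs):
    """Vertex of the parabola through three (gene, fitness) points; None on failure."""
    (x1, x2, x3), (f1, f2, f3) = xs, fs
    if len({x1, x2, x3}) < 3:
        return None  # repeated gene values: no unique parabola
    d1 = (f2 - f1) / (x2 - x1)
    d2 = (f3 - f2) / (x3 - x2)
    a = (d2 - d1) / (x3 - x1)  # leading coefficient of the fitted parabola
    if a <= 0:
        return None  # the fitted curve has no minimum
    x = 0.5 * (x1 + x2) - d1 / (2 * a)  # where the parabola's slope is zero
    return x if 0.0 <= x <= 1.0 else None  # assumed failure: outside gene range

def heuristic_gene(x_better, x_worse, rng):
    """Step past the better parent, away from the worse one; None on failure."""
    x = x_better + rng.random() * (x_better - x_worse)
    return x if 0.0 <= x <= 1.0 else None

def crossover_gene(xs, fs, rng=random):
    """xs, fs: one gene value and the fitness (minimized) of each of three parents."""
    child = quadratic_gene(xs, fs)
    if child is None:
        order = sorted(range(3), key=lambda i: fs[i])  # fittest parent first
        child = heuristic_gene(xs[order[0]], xs[order[1]], rng)
    if child is None:
        child = rng.choice(xs)  # last resort: copy a parent's gene at random
    return child
```

A full child would be built by calling `crossover_gene` once per gene position.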


Mutation for the real-valued GA can take many different forms. The one used here was Gaussian: mutated genes were
pulled  from  a  distribution  with  a  mean  equal  to  the  unmutated  gene,  and  a  standard  deviation  of  0.1  of  the  full  gene  range. 
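
As a minimal sketch of this mutation operator: the clipping of out-of-range values to the gene limits is an assumption, since the text does not say how such values are handled, and the function name is hypothetical.

```python
import random

def mutate_gene(gene, rng=random, sigma=0.1, low=0.0, high=1.0):
    """Draw a mutated gene from a Gaussian centered on the unmutated gene.

    The standard deviation is sigma times the full gene range.
    Clipping to [low, high] is an assumption, not stated in the text.
    """
    value = rng.gauss(gene, sigma * (high - low))
    return min(high, max(low, value))
```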

Each  gene  varied  over  the  same  range.  We  chose  this  range  to  be  from  0  to  1.  Each  gene  is  translated  into  parameter  values  as 
appropriate,  and  can  cover  very  different  ranges  in  the  design  space.  However,  normalizing  the  gene  values  in  this  way 
allows  the  accurate  calculation  of  the  genetic  distance  between  individuals  using  Euclidean  geometry,  a  very  important 
quality  when  determining  the  clusters  in  a  population. 
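
The normalization can be illustrated with a short sketch. The helper names are hypothetical, and the example bounds are the ranges read from the Figure 3 labels for the two-wire Yagi (reflector length 0 to 4 wavelengths, separation 0.04 to 2 wavelengths).

```python
import math

def decode(genes, bounds):
    """Map normalized genes in [0, 1] onto their physical parameter ranges."""
    return [lo + g * (hi - lo) for g, (lo, hi) in zip(genes, bounds)]

def genetic_distance(a, b):
    """Euclidean distance between two individuals in normalized gene space."""
    return math.dist(a, b)

# Two-wire Yagi: (reflector length, separation), both in wavelengths
yagi_bounds = [(0.0, 4.0), (0.04, 2.0)]
length, separation = decode([0.12, 0.05], yagi_bounds)
```

Because every gene spans the same 0 to 1 interval, no single design parameter dominates the distance simply because its physical range is larger.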

Regarding  other  GA  parameters,  mating  selection  was  accomplished  via  the  weighted  roulette  wheel  method  of  [2],  and  a 
steady-state  GA  was  used,  in  which  the  parents  of  the  next  generation  are  the  best  of  a  specified  percentage  of  the  total 
population.  This  percentage  is  called  the  overlap,  for  it  is  the  portion  of  each  generation  that  carries  over  to  the  next.  This 
type  of  GA  has  proved  to  converge  quickly,  a  feature  necessary  to  accommodate  the  costly  simulation  time  of  antenna 
designs.  Fitness  scaling  was  also  used  for  the  weighted  roulette  wheel,  basing  the  amount  of  the  roulette  wheel  given  to  an 
individual  on  the  difference  between  the  scores  of  that  individual  and  the  worst  parent  carried  over  from  the  previous 
generation. 
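
One possible reading of this fitness-scaled roulette wheel is sketched below. It assumes the score is being maximized, and the function names and the uniform fallback when all parents tie are illustrative choices, not details from [2].

```python
import random

def roulette_weights(parent_scores, worst_carried_score):
    """Wheel share of each parent: its margin over the worst carried-over parent."""
    return [max(s - worst_carried_score, 0.0) for s in parent_scores]

def select_parent(parents, weights, rng=random):
    """Spin the weighted wheel once and return the selected parent."""
    if sum(weights) == 0.0:
        return rng.choice(parents)  # all parents tied: fall back to a uniform pick
    return rng.choices(parents, weights=weights, k=1)[0]
```

Under this scaling the worst carried-over parent receives no share of the wheel, which sharpens selection pressure when scores are close together.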


1.2  Wire  Antenna  Design 

Since  F.  Braun  created  the  first  wire  antenna  in  1898,  a  variety  of  wire  antennas  have  appeared:  monopoles  (e.g.,  car  whip 
antennas),  log-periodic  antennas  (e.g.,  rooftop  TV  aerials),  helix  and  spiral  antennas,  and  a  host  of  other  types.  In  recent 
years, GAs have proven sufficiently powerful to optimize even very challenging designs for unusual applications [3, 6, 7, 8].
Several antenna design terms that are important in this paper are defined below.

Directivity  and  gain  are  two  related  qualities  in  antenna  design.  Directivity  is  the  ratio  of  power  density  being  transmitted  by 
an  antenna  in  a  particular  direction  to  the  average  power  density  being  transmitted  in  all  directions.  The  gain  is  the  directivity 
multiplied  by  the  ratio  of  power  radiated  to  power  input.  Gain  takes  into  account  the  losses  due  to  resistance  in  the  antenna, 
which  converts  some  of  the  input  power  into  heat.  When  the  losses  are  considered  to  be  zero,  as  in  this  paper,  the  directivity 
and  gain  are  equal. 

Gain is usually expressed in decibels (dB), which relates to a ratio of power or power densities by the following expression:
$\mathrm{dB} = 10 \log_{10}(P_1/P_2)$. In the case of gain, $P_2$ is the power density of an isotropic radiator that transmits power equally in all
directions. The abbreviation dBi refers to gain compared with an isotropic radiator. However, the “i” is sometimes left off,
and is understood from context.
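
As a small worked example of these definitions (the function names and the `efficiency` argument, standing for the ratio of power radiated to power input, are illustrative):

```python
import math

def to_db(p1, p2):
    """Ratio of two powers (or power densities) expressed in decibels."""
    return 10.0 * math.log10(p1 / p2)

def gain_dbi(directivity, efficiency=1.0):
    """Gain relative to an isotropic radiator, in dBi.

    Gain is the directivity multiplied by the ratio of power radiated
    to power input; with zero losses that ratio is 1 and gain equals
    directivity.
    """
    return to_db(directivity * efficiency, 1.0)
```

For instance, a lossless antenna with a directivity of 4 has a gain of about 6.02 dBi, and halving the radiated power costs about 3.01 dB.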


A  gain  pattern  or  antenna  pattern  plots  gain  magnitude  versus  angle,  showing  the  proportion  of  power  an  antenna  transmits 
in a particular direction. For 2-D antennas, or antennas symmetric in the third dimension, this angle is simply the elevation
angle θ. In 3-D, there are two angles that specify a direction: θ and the azimuth (φ). Figure 2 shows these angles on a set of
axes. An antenna is considered to be directive if its gain pattern is heavily weighted in one direction.

Figure 2. θ and φ on a 3-D axis system. Arrows begin where NEC2 defines 0 degrees for θ and φ.

A  ground  plane — at  its  simplest  a  large,  flat  metal  plate  underneath  the  antenna — is  often  used  in  conjunction  with  a  wire 
antenna.  It  acts  as  a  mirror  for  the  antenna  above  it,  and  therefore  changes  the  antenna  gain  pattern.  A  ground  plane  can 
decrease  the  height  and/or  simplify  the  construction  of  the  wire  antenna.  The  hood  or  roof  of  a  car  acts  as  a  ground  plane,  and 
antennas  that  will  be  affixed  to  such  places  need  to  be  designed  for  use  with  one. 

Several electromagnetic simulators exist for wire antennas. One particularly suited to the task of creating a
general antenna synthesis system is the Numerical Electromagnetics Code, Version 2 (NEC2) [9]. This code was used
exclusively in this research. NEC2 has a simple file interface for input and output that makes it ideal for use with an
optimizer. The code is in the public domain, so obtaining and modifying the source code is cost-free and easy, as is copying
the simulator between machines. But perhaps most important, it has a long track record of being accurate. The NEC2 code
was produced in the early 1980s, and has been used to simulate antenna structures for many years. It has shown itself to be
in very good agreement with actual measurements, and thus one can have more confidence that answers received from
simulation have validity.

There  are  three  antennas  that  will  be  discussed  in  this  paper:  a  two-wire  Yagi  antenna,  a  loaded  monopole,  and  a  14-wire 
Yagi  antenna. 

1.2.1 The Two-wire Yagi

The  Yagi  antenna  is  a  series  of  parallel  wires,  first  proposed  by  Prof.  Yagi  and  his  student  S.  Uda  in  the  late  1920s.  One 
element  is  driven,  one  element  is  behind  the  driven  element  and  is  called  the  reflector,  and,  usually,  there  are  other  elements 
in  front  of  the  driven  element  called  directors.  The  highest  gain  can  be  achieved  along  the  axis  and  on  the  side  with  the 
directors.  The  reflector  acts  like  a  small  ground  plane,  allowing  power  that  would  otherwise  be  sent  backward  to  be  reflected 
forward. 

In  this  case,  there  are  no  directors — only  the  reflector  and  the  driven  element.  This  gives  a  two-dimensional  problem,  as 
shown  in  Figure  3.  The  chromosome  for  this  antenna  is  two  real  genes,  encoding  length  and  separation  respectively. 


[Figure 3 labels: driven element, 0.5λ, with the drive point in the center of the element; separation distance, 0.04 - 2λ; reflector element, 0 - 4λ.]

Figure 3. Two-element Yagi antenna search space. λ = 1 wavelength.

In spite of the fact that there are only two variables, the response surface is very multimodal, as shown in Figure 4. This behavior is
typical for electromagnetic problems, which are usually filled with local minima. It also shows why GAs are among
the most powerful techniques for solving these problems: their parallel sampling of the search space lets them resist
many of the local minima.

Figure 4. Response surface for gain vs. separation and reflector length.

The goal for this antenna is to maximize the forward gain, so the objective function for this antenna is simply the gain. As
can be seen on the graph, the best parameter settings to maximize gain are a length of about 0.48λ and a separation of about
0.14λ. Figure 5 shows what the antenna pattern looks like near this maximum.


Figure 5. Radiation pattern of an antenna near the maximum.

GAs  optimizing  this  antenna  show  clustering  in  a  very  clear  way,  as  will  be  described  in  the  next  section.  But  first,  the  other 
wire  antennas,  the  loaded  monopole  and  the  14-wire  Yagi,  need  explanation. 

1.2.2 The Loaded Monopole

A monopole loaded with a modified folded dipole has been previously investigated [10]. Its search space is shown in
Figure 6.


Figure  6.  The  loaded  monopole  search  space 

The chromosome for this antenna is six real-valued genes, encoding Z1 through Z4, then X1 and X2. However, the ordering
makes  no  difference,  because  the  crossover  techniques  described  in  section  1.1  are  applied  separately  for  each  gene. 

This  antenna  is  capable  of  having  even  coverage  over  the  upper  hemisphere  given  the  proper  set  of  parameters  [8].  The 
resulting  pattern  for  one  such  configuration  is  shown  in  Figure  7. 


Figure 7. Folded monopole pattern and corresponding optimized design.

What  is  unusual  is  that  the  shape  is  so  asymmetric.  This  asymmetry  was  an  unexpected  result,  but  further  study  showed  it  to 
be  necessary  to  achieve  the  very  flat  pattern  shown  in  Figure  7. 

The  objective  function  for  this  antenna  is  the  sum  of  the  squares  of  the  deviation  of  all  calculated  gains  from  the  mean.  In 
equation  form: 

$$\text{Fitness} = \sum_{\text{all } \theta,\, \phi} \left( \text{Gain}(\theta, \phi) - \text{Avg. Gain} \right)^2$$

The  GA's  goal  is  to  minimize  this  function. 
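
In code, this objective might look like the sketch below. It assumes the gains have already been computed at every sampled (θ, φ) direction and collected in a flat list; whether the gains are in dB or linear units is not stated in the text, and the function name is hypothetical.

```python
def flatness_fitness(gains):
    """Sum of squared deviations of the sampled gains from their mean.

    A perfectly even pattern scores 0; the GA minimizes this value.
    """
    mean = sum(gains) / len(gains)
    return sum((g - mean) ** 2 for g in gains)
```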


1.2.3 The 14-wire Yagi

The 14-wire Yagi antenna is a more traditional Yagi antenna than the two-wire Yagi above, with a reflector, driven element,
and 12 directors as shown in Figure 8. This antenna optimization is the most challenging of all the examples, with 28 variables,
multiple criteria, and a difficult, sensitive search space. It will show whether the clustering technique described in this paper
will work on a truly difficult problem.

Figure 8. 14-wire Yagi design.


The real-valued chromosome consists of 14 length genes, 13 spacing genes, and a gene for wire diameter. Each wire length is
allowed to vary between 0.0λ (effectively removing the element) and 0.75λ. The wires are constrained to be symmetric.

The spacing between wires is constrained to be greater than 0.05λ. However, the boomlength is constrained to be 3.60λ, so
the 14 wires are spaced along this length as follows: the values of the genes corresponding to the spacings are totaled, then
the boomlength is divided by this total. This result is multiplied by each spacing gene value to give the required spacing
between each pair of wires. The last variable is the wire diameter, which is allowed to vary between 0.004λ and 0.012λ.
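
The spacing rule above amounts to a simple rescaling, sketched here with a hypothetical function name. How the 0.05λ minimum spacing is enforced is not described in the text, so it is not handled in this sketch.

```python
def wire_spacings(spacing_genes, boom_length=3.60):
    """Scale the spacing genes so the spacings add up to the boom length.

    All lengths are in wavelengths. The boom length is divided by the
    total of the spacing genes, and each gene is multiplied by that factor.
    """
    scale = boom_length / sum(spacing_genes)
    return [scale * g for g in spacing_genes]
```

Only the ratios between the spacing genes matter: doubling every gene leaves the resulting antenna unchanged.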

The  criteria  were  VSWR  and  endfire  gain.  The  score  was  given  by: 

$$\text{Score} = G - C_1 \times \text{VSWR}$$

where $G$ is the endfire gain and $C_1$ is 10 when the VSWR is greater than 3.0 and 0.50 when the VSWR is less than 3.0. (It
should also be noted that (VSWR - 1.0) is used instead of VSWR when it is less than 3.0 to further decrease the importance of
this factor on the score.) The objective was to maximize the score.
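
The scoring rule can be written out as a short function. The text does not say which branch applies when the VSWR is exactly 3.0; this sketch assumes the lighter penalty in that case, and the function name is hypothetical.

```python
def yagi_score(gain, vswr):
    """Endfire gain minus a VSWR penalty; the GA maximizes this score."""
    if vswr > 3.0:
        return gain - 10.0 * vswr  # heavy penalty for a poor match
    # Assumption: VSWR == 3.0 is treated like the low-VSWR case
    return gain - 0.50 * (vswr - 1.0)
```

With this form a perfect match (VSWR of 1.0) costs nothing, while any VSWR above 3.0 costs at least 30 points, which effectively rules out poorly matched designs.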

These  three  antennas  will  show,  on  a  preliminary  level,  the  applicability  of  the  clustering  technique  described  in  the  next 
section. 

2.0  The  Clustering  Method 

GAs  usually  begin  with  a  randomly  generated  population,  scattered  stochastically  around  the  search  space.  As  survival  of  the 
fittest  is  applied,  the  population  quickly  begins  to  avoid  unfruitful  areas.  Then,  the  population  begins  to  cluster  around  certain 
places  in  the  search  space.  What  is  happening  is  that  those  regions  are  loci  of  good  fitness,  and  individuals  produced  within 
them  are  viable — i.e.,  they  will  have  sufficient  fitness  to  survive.  Those  that  are  produced  outside  of  these  regions  will 
probably  not  have  enough  fitness  to  survive  once  the  population  is  firmly  clustered  around  these  points.  This  effect  can  also 
be  regarded  as  speciation,  for  intraspecies  individuals,  likely  to  remain  inside  a  cluster,  will  survive,  while  interspecies 
individuals,  likely  to  fall  outside  of  any  cluster,  will  perish. 

Once the population is clustered, there will be little exploration of the search space. What will happen is a “battle” in which
the  clusters  fight  for  individuals.  The  better-scoring  clusters  will  generally  receive  more  of  the  new  children,  and  as  the  scores 
increase,  the  lesser  clusters  will  lose  individuals,  finally  dying  off  one  by  one  until  only  one  cluster  is  left. 

Figure 9 gives a graphical representation of this process, taken from an optimization of the two-wire Yagi antenna using a
real-valued GA.

Figure 9. An example of clustering in the case of the 2-wire Yagi. From [6].

Generation  1  is  randomly  scattered  throughout  the  space.  As  can  be  seen,  the  worst  areas  are  avoided  even  beginning  with 
generation  2,  and  by  generation  8  the  population  is  clustered  around  three  points.  From  generation  8  through  23,  nothing 
happens  except  gradual  extinction  of  clusters  until  only  the  best  remains.  The  GA  requires  just  as  many  resources  during  this 
last  process  as  it  did  when  it  was  very  effectively  finding  good  regions  in  the  search  space. 

It  makes  intuitive  sense,  then,  that  this  battle  is  simply  a  waste  of  resources.  Why  not  stop  the  GA  when  the  population  is 
clustered,  and  use  a  local  optimizer  on  one  or  more  of  the  clusters,  since  each  cluster  is  probably  a  single  peak  in  the  search 
space? 

The  challenge  in  applying  this  idea  is  finding  an  automated  technique  that  can  detect  when  the  population  is  properly 
clustered. Though there are many ways to detect this condition, a simple approach was taken in this research, which
involved  using  a  threshold  value  for  cluster  radius,  similar  to  [11]. 

To  start  the  first  cluster,  the  two  closest  individuals  in  the  population,  as  determined  by  Euclidean  distance,  are  clustered,  if 
they  are  closer  than  the  cluster  threshold.  The  center  point  between  them  is  calculated,  then  the  nearest  individual  to  this 
center  point  is  added  if  its  distance  is  less  than  the  cluster  threshold.  The  new  center  point  of  the  cluster  is  calculated,  the 
next-closest  individual  added,  etc.,  until  there  are  no  other  individuals  within  the  cluster  threshold  distance  from  the  cluster 
center. 

The  closest  pair  of  individuals  not  already  clustered  is  then  checked  to  see  if  the  distance  between  them  is  less  than  the  cluster 
threshold.  If  it  is,  then  a  new  cluster  is  formed  in  the  manner  of  the  first  one.  This  process  continues  until  there  are  no 
unclustered  individuals  closer  to  each  other  than  the  cluster  threshold. 
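
The cluster-building procedure described above can be sketched as follows. This is an illustrative reading of the text rather than the authors' code: individuals are assumed to be tuples of normalized genes, clusters are returned as lists of population indices, and the function names are hypothetical.

```python
import math
from itertools import combinations

def centroid(points):
    """Center point of a group of individuals."""
    return [sum(c) / len(points) for c in zip(*points)]

def find_clusters(population, threshold):
    """Greedy threshold clustering; returns clusters as lists of indices."""
    free = set(range(len(population)))
    clusters = []
    while len(free) >= 2:
        # Closest pair among the individuals not yet clustered
        i, j = min(combinations(sorted(free), 2),
                   key=lambda p: math.dist(population[p[0]], population[p[1]]))
        if math.dist(population[i], population[j]) >= threshold:
            break  # no remaining pair is close enough to seed a cluster
        members = [i, j]
        free -= {i, j}
        while free:
            # Recompute the center, then try to absorb the nearest individual
            center = centroid([population[k] for k in members])
            k = min(sorted(free), key=lambda m: math.dist(population[m], center))
            if math.dist(population[k], center) >= threshold:
                break
            members.append(k)
            free.remove(k)
        clusters.append(members)
    return clusters
```

The closest-pair search is quadratic in the population size, which is negligible next to the cost of an antenna simulation for the population sizes used here.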

Once  a  specified  percentage  of  parents  is  clustered,  the  GA  is  halted.  As  will  be  shown,  this  percentage  makes  a  large 
difference in the effectiveness of this procedure, for if one halts the GA before a sufficient number of parents are clustered,
the  local  optimization  will  not  be  very  effective,  for  the  best  peak  has  not  been  sufficiently  defined. 

In  addition,  an  elitist  cluster  routine  was  found  to  be  the  most  effective.  An  elitist  routine  is  one  that  specifies  that  regardless 
of  the  percentage  of  the  parents  that  are  clustered,  the  GA  will  not  be  halted  until  the  best  individual  is  clustered  as  well.  A 
study  comparing  the  elitist  and  non-elitist  routines  showed  far  better  results  with  small  additional  computational  expense  for 
the  elitist  routine.  This  result  is  intuitive,  for  if  the  best  individual  is  not  in  a  region  with  a  cluster,  there  is  a  good  likelihood 
that  the  GA  is  not  done  exploring  the  space  yet  and  there  will  still  be  some  shifting  in  clusters  before  it  is  ready  to  be  halted. 
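
A halting test combining the clustered-parent percentage with the elitist condition might look like this sketch. The argument names are hypothetical; clusters are assumed to be lists of population indices.

```python
def should_halt(clusters, parent_indices, best_index, required_fraction):
    """Halt only when enough parents are clustered AND the best individual is too."""
    clustered = {k for members in clusters for k in members}
    fraction = sum(1 for p in parent_indices if p in clustered) / len(parent_indices)
    # Elitist condition: the best individual must belong to some cluster
    return fraction >= required_fraction and best_index in clustered
```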

It  was  initially  thought  that  the  local  optimizer  might  be  most  effective  if  it  operated  on  the  center  of  the  cluster,  as  opposed 
to  the  best  individual  from  the  cluster.  Study  showed  this  was  not  the  case;  results  were  disappointing  from  the  cluster  center, 
but  were  very  good  from  the  best  individual.  For  this  reason,  the  score  of  the  cluster  is  taken  as  the  score  of  its  best 
individual,  and  that  individual  is  passed  to  the  local  optimizer  when  the  GA  is  halted. 

Before  tuning  the  method,  it  was  not  known  if  one  needed  to  optimize  all  clusters  to  be  reasonably  certain  of  getting  the  best 
answer,  or  if  it  was  sufficient  to  optimize  only  the  best  cluster.  We  were  surprised  to  learn  that  optimizing  only  the  best 
individual,  which  is  contained  in  the  best  cluster  by  default,  is  sufficient  to  produce  excellent  results.  On  rare  occasions,  the 
second-best cluster actually was located on the best peak, but this happened so infrequently (less than 5% of the time) that the
extra  function  calls  necessary  to  optimize  the  second-best  peak  were  deemed  not  worth  the  expense.  However,  this  behavior 
needs  to  be  explored  for  more  problems,  for  it  is  conceivable  that  more  complex  problems  in  very  spiky  search  spaces  may 
show  greater  benefit  when  the  less-fit  clusters  are  optimized. 

Though this process is tied to a specified cluster threshold, it appears to be broadly effective, and more so on more
difficult problems. The results of our experiments with this method will now be discussed.

3.0  Results 

In  this  section,  the  effectiveness  of  the  clustering  routine  is  discussed  for  many  different  problems,  including  simple  test 
functions,  the  two-wire  Yagi,  the  loaded  monopole,  and  the  14-wire  Yagi.  The  routine  seems  to  be  effective  on  both  simple 
and  complex  problems,  as  will  be  shown. 

In  order  to  compare  the  clustering  method  with  a  standard  GA,  a  standard  baseline  GA  needed  to  be  created.  This  GA  has  the 
following convergence criteria, which are not particularly unusual: halting after the best individual has been static for 11
generations, or when the range of values present in the parents for each gene falls within 1% of the total gene range. After the
GA  is  halted,  a  conjugate  gradient  local  optimizer,  the  same  as  is  used  for  the  clustering  method,  is  used  to  optimize  the  best 
individual. 
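
The two baseline stopping rules can be sketched as below. The interpretation of "static" as an unchanged best score over the last 11 generations, the normalized gene range of 0 to 1, and all names are assumptions made for illustration.

```python
def baseline_converged(best_history, parent_genes, patience=11, tolerance=0.01):
    """best_history: best score per generation; parent_genes: gene lists in [0, 1]."""
    # Rule 1: the best score has not changed over the last `patience` generations
    recent = best_history[-(patience + 1):]
    stalled = len(recent) == patience + 1 and all(s == recent[-1] for s in recent)
    # Rule 2: for every gene, the spread across parents is within 1% of the range
    collapsed = all(max(col) - min(col) <= tolerance for col in zip(*parent_genes))
    return stalled or collapsed
```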

3.1 Test functions

These  functions  were  used  to  create  and  debug  the  clustering  routine,  though  they  could  not  be  used  to  fine-tune  the  routine 
because  they  are  so  simple.  However,  they  do  show  that  the  clustering  routine  is  effective  even  for  simple  problems,  and  are 
included  for  completeness. 

Three  simple  test  cases  were  used.  The  first  test  case  is  De  Jong’s  F5  [12],  shown  below.  It  has  two  dimensions,  and  a 
maximum  value  of  1.002. 


Figure  10.  De  Jong’s  F5  test  function 

The second test case is a low-modality sinusoidal function, given by the equation:

$$\text{Score} = \sum_{i=1}^{6} \left( \sin(\pi x_i) - \cos(3 \pi x_i) \right)$$

Figure 11. One dimension of sinusoidal test function #1

This function was tested in six dimensions, where it has a maximum value of 11.272.

The third test case is a more challenging sinusoidal function, tested in 10 dimensions, with a maximum value of 20.0. Its
equation is:

$$\text{Score} = \sum_{i=1}^{10} \left( \sin(\pi x_i) - \cos(10 \pi x_i) \right)$$


Figure  12.  One  dimension  of  sinusoidal  test  function  #2 
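
The two sinusoidal test functions are easy to check numerically. The sketch below assumes each variable ranges over 0 to 1, matching the normalized gene range used elsewhere in the paper, and finds the peak of one dimension on a fine grid; since the functions are separable, the overall maximum is that peak times the number of dimensions. The function names and grid resolution are illustrative.

```python
import math

def sinusoid_1(x):
    """Low-modality sinusoidal test function (tested in 6 dimensions)."""
    return sum(math.sin(math.pi * xi) - math.cos(3 * math.pi * xi) for xi in x)

def sinusoid_2(x):
    """More challenging sinusoidal test function (tested in 10 dimensions)."""
    return sum(math.sin(math.pi * xi) - math.cos(10 * math.pi * xi) for xi in x)

def max_per_dimension(func, steps=100_000):
    """Grid-search the peak of a single dimension on [0, 1]."""
    return max(func([i / steps]) for i in range(steps + 1))

peak_1 = 6 * max_per_dimension(sinusoid_1)
peak_2 = 10 * max_per_dimension(sinusoid_2)
print(round(peak_1, 3), round(peak_2, 3))
```

Under the assumed 0 to 1 interval, the computed peaks should agree with the maxima of 11.272 and 20.0 quoted in the text.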

Each  test  case  was  tried  over  a  range  of  population  sizes  (25,  50,  75,  100,  150,  200)  and  overlaps  (0.25,  0.5,  and  0.75).  The 
results  were  averaged  over  all  combinations  of  these  two  variables,  to  determine  the  overall  effect  of  the  method  on  results 
without  bias  toward  a  particular  population  or  overlap  value.  The  results  of  these  experiments  are  contained  in  the  table 
below.


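| Test function | Score (baseline) | Score (cluster method) | Score % difference | Calls (baseline) | Calls (cluster method) | Calls % difference |
|---|---|---|---|---|---|---|
| F5 | 0.789 (baseline or cluster; other entry missing) | | -8.3% | 693 | 459 | -33.7% |
| Sinusoid #1 | 10.9 | 11.0 | 0.3% | 1236 | 669 | -45.9% |
| Sinusoid #2 | 18.9 | 17.9 | -5.1% | 1763 | 1358 | -23.0% |

Table 1. Results from test functions. Scores are averages; calls are average objective function calls.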


For the F5 test function, the clustering method loses 8.3% in score while decreasing function calls by 33.7%, and both
changes are statistically significant as shown by Student’s t-test.

The  2nd  test  case  performed  quite  well.  There  was  a  statistically  insignificant  difference  in  the  score  between  the  clustering 
method  and  the  baseline  GA,  while  the  decrease  in  function  evaluations  was  45.9%! 

For  the  third  case,  the  clustering  technique  took  a  loss  of  5.1%  on  average  score  while  providing  only  a  23.0%  gain  in 
efficiency.  This  is  not  particularly  spectacular,  though  it  is  significant. 

All three test cases showed varying degrees of improvement in runtime, between about 20% and 50%, and varying degrees of
change  (generally  degradation)  in  optimization  quality,  between  0.3%  and  -10%.  However,  the  real  test  of  the  method  is  in 
solving  actual  design  problems,  which  will  now  be  discussed. 

3.2  The  Two-wire  Yagi 

Two  experiments  were  run  with  the  two-wire  Yagi:  a  parameter  tuning  experiment,  and  a  confirmation  of  the  clustering  effect 
over  a  wide  range  of  population  sizes  and  overlaps. 

3.2.1  Fine-tuning  the  clustering  parameters 

Though  simple  test  cases  showed  improvement  with  this  method,  they  were  too  simple  to  use  in  fine-tuning.  The  parameters 
for  this  method  are  the  percentage  of  parents  to  be  clustered  before  halting  the  GA  and  the  cluster  threshold  (also  called 
cluster  size).  Each  was  tuned  preliminarily  using  the  test  functions  above.  However,  the  parameter  values  that  worked  for  the 
test  functions  did  not  work  at  all  well  for  the  two-wire  Yagi.  Therefore,  an  experiment  was  run  to  determine  the  best  values 
for  these  parameters  for  this  more  realistic  engineering  problem. 

The data points shown in Table 2 are the average performance over three population sizes (25, 50 and 100) and two overlaps
(0.25 and 0.5), which gives a broad indication of its effectiveness. The cluster threshold (which is the maximum Euclidean
distance between any two members in the cluster) was varied between 0.1 and 0.3, and the percentage of the parents that
were required to be clustered varied from 70% to 90%. The resulting average scores and number of objective calls required
to complete the optimization are shown below.


| Cluster threshold | % clustered | Avg. score | Avg. objective function calls |
|---|---|---|---|
| 0.1 | 70% | 5.25 | 703 |
| 0.3 | 70% | 5.55 | 523 |
| 0.1 | 90% | 6.40 | 650 |
| 0.3 | 90% | 5.29 | 592 |

Table 2. Cluster parameter experiment


The  results  show  that  the  best  scores  resulted  from  a  tight  clustering  threshold,  and  as  large  a  percentage  of  the  parents 
clustered  as  possible  before  halting  the  GA.  Though  these  settings  do  not  give  the  best  time  savings,  the  difference  in  scores 
makes the extra simulations worthwhile. These settings make intuitive sense as well, for if the clusters are too large, the
cluster  may  actually  cover  more  than  one  local  minimum,  causing  the  local  optimizer  to  fail.  In  addition,  if  some  parents  are 
not  clustered,  that  means  that  some  viable  individuals  are  alone  in  their  region  of  the  search  space,  and  they  are  probably  on 
some  sort  of  peak  that  should  be  investigated  before  halting  the  GA. 

Further investigation showed that increasing the percentage of parents clustered gave still better results. The parameter
settings used for the rest of the experiments with clustering were therefore 99% of parents clustered and a cluster threshold of 0.1.

3.2.2  Confirming  the  effectiveness  of  the  clustering  method 

Another  full-factorial  experiment  shows  the  effectiveness  of  the  clustering  method  on  saving  objective  function  calls  while 
not  significantly  disrupting  performance.  The  results  are  shown  below. 

As  with  the  test  cases,  the  baseline  and  clustering  method  were  tried  over  a  range  of  population  sizes  (25,  50,  75,  100,  150, 
200)  and  overlaps  (0.25,  0.5,  and  0.75).  The  results  were  averaged  over  all  combinations  of  these  two  variables,  to  determine 
the  overall  effect  of  the  method  on  results  without  bias  toward  a  particular  population  or  overlap  value.  The  results  of  these 
experiments  are  contained  in  the  table  below. 


                                   Baseline   Cluster method   % difference
Average score                      6.241      6.395            2.5%
Average objective function calls   884        415              -53.0%

Table 3. Comparison of the baseline GA and the GA using the clustering method for the two-wire Yagi.


A Student's t-test showed that the difference between the baseline and clustering GA average scores was statistically insignificant, with
a 44.5% probability that it arose by chance. On the other hand, the difference in objective function calls is so significant that there
is less than a 0.002% chance that it occurred by accident. The experiment also showed that the best predictor of score
performance was not clustering but population size. The data show that the larger the population, the better the score, at least
up to the 200-individual population size. (Incidentally, previous research [6] has shown that too large a population can actually
decrease performance.)

Of course, the major significance of these results is that the clustered GA requires less than 50% of the function calls that the
baseline does, for essentially no change in score. This is a phenomenal result, but the next design case shows even greater
improvement.

3.3  The  Loaded  Monopole 

Using  the  tuned  parameters  of  99%  of  parents  clustered  and  0.1  cluster  threshold  size,  the  loaded  monopole  was  optimized 
using  the  clustering  method.  A  comparison  of  the  two  GAs  follows,  with  population  sizes  and  overlaps  chosen  for  each  at 
their optimal point as tuned by the two-wire Yagi experiment:


             Population   Overlap   Average objectives   Average score
Baseline     200          0.42      17736                18.1
Clustering   200          0.25      2078                 67.3

Table 4. Baseline vs. Clustering GA performance for the loaded monopole

The  baseline  case  was  run  6  times,  the  clustering  case  5.  Both  methods  achieved  very  good  designs,  but  there  is  an  88.3% 
savings  in  objective  function  calls  using  the  clustering  method!  However,  there  is  a  statistically  significant  increase  in  the 
score  for  the  clustered  case.  Recall  that  this  objective  function  is  to  be  minimized,  with  the  ideal  being  zero.  While  this 
degradation  may  seem  significant,  the  average  difference  of  49.1,  distributed  over  the  1,188  angles  in  the  objective  function, 
equates  to  an  additional  0.20  dB  of  variation  per  angle.  This  extra  variation  is  insignificant  to  the  design,  especially  in  light  of 
the  expected  fabrication  tolerance  and  simulator  accuracy. 
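
The two figures quoted above can be checked with a few lines of arithmetic. This is only a sketch: the 88.3% follows directly from Table 4, while the 0.20 dB value is reproduced here under the assumption that the objective is a sum of squared deviations in dB over all angles, so that the per-angle figure is a root-mean-square value. The definition of the objective lies outside this section, so that reading is not confirmed by the text.

```python
import math

# Values copied from Table 4 and the paragraph above.
baseline_calls, cluster_calls = 17736, 2078
score_gap = 49.1      # average score difference quoted in the text
n_angles = 1188       # number of angles in the objective function

# Relative saving in objective function calls.
savings_pct = 100.0 * (baseline_calls - cluster_calls) / baseline_calls

# ASSUMPTION: the score is a sum of squared dB deviations over all angles,
# so the per-angle variation is taken as an RMS value.
rms_db_per_angle = math.sqrt(score_gap / n_angles)

print(f"savings in objective calls: {savings_pct:.1f}%")
print(f"extra variation per angle:  {rms_db_per_angle:.2f} dB")
```

Under that assumption both numbers agree with the text to the quoted precision.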

However,  this  problem  was  fairly  easy,  so  a  more  difficult  problem  is  needed  to  show  whether  this  method  will  be  generally 
useful. 

3.4  The  14-wire  Yagi 

Using  the  tuned  parameters  of  99%  of  parents  clustered,  the  14-wire  Yagi  was  optimized  using  both  methods.  However,  the 
cluster  size  made  a  significant  difference  in  the  resulting  score  of  the  Yagi  antenna.  Several  runs  were  conducted  with  various 
cluster  threshold  values  as  shown  below. 


             Population   Overlap   Cluster threshold   Average objectives   Average score
Baseline     200          0.42      -                   22299                16.29
Clustering   200          0.25      0.53                3549                 14.94
             200          0.25      0.26                4930                 15.51
             200          0.25      0.053               12898                16.22

Table 5. Baseline vs. Clustering GA performance for the 14-wire Yagi


Note that the largest two cluster threshold sizes used are larger than in the loaded monopole, to account for the larger number
of dimensions. However, the results show that increasing the cluster threshold caused significantly poorer scores.

In  this  case,  a  one-point  difference  in  the  score  makes  a  big  difference  in  the  quality  of  the  design,  since  Yagi  antennas  are 
desired  to  be  as  well-matched  and  as  high-gain  as  possible.  A  drop  of  1  point  means  a  decrease  of  1  dB  of  gain  or  a  VSWR 
over  3.0.  Thus,  the  difference  in  score  between  the  baseline  and  the  clustering  method  using  a  cluster  size  of  0.53  was 
unacceptable.  The  search  space  was  too  difficult  to  search  with  a  local  optimizer  if  the  cluster  had  only  converged  to  that  size. 
However,  by  tightening  up  the  size  of  the  cluster,  the  clustering  method  was  able  to  essentially  match  the  baseline  score,  but 
in  about  58%  of  the  objective  calls! 

Incidentally, the gain of a typical Yagi with a score of 16.2 is 16.23 dB, with a VSWR of 1.06. A typical Yagi designed using
conventional means has a gain of 15.9 dB and a VSWR of 1.23 [6].

Thus,  there  is  a  tremendous  speed  advantage  to  using  this  method  for  this  and  the  previous  time-intensive  problems,  and  the 
price  in  design  performance  can  be  trivial  if  the  proper  settings  are  used. 

Conclusion 

In  general,  the  clustering  method  shows  significant,  even  remarkable,  time  savings  over  more  typical  methods  of  determining 
convergence.  The  time  saved  by  using  the  clustering  method  is  directly  proportional  to  the  decrease  in  the  number  of 
objective  function  calls  for  problems  with  any  time-consuming  simulations,  as  in  wire  antenna  design.  These  savings  can  be 
as  much  as  90%  without  significant  degradation  to  design  performance. 

However, the ideal settings for the method have been shown to be problem dependent, though the trend we have found is that
the  more  the  population  is  converged  and  the  tighter  the  clusters  are  required  to  be,  the  less  design  performance  degrades. 
Naturally,  this  performance  is  achieved  at  the  expense  of  objective  function  calls.  While  good  starting  values  seem  to  be  99% 
of  the  population  clustered,  and  0.1  cluster  size  (given  a  range  of  0-1  for  all  genes),  the  best  settings  have  to  be  determined 
on  a  case-by-case  basis. 

While  the  results  presented  here  are  very  promising,  there  is  much  work  that  remains.  First,  an  adaptive  method  of  clustering 
that  does  not  depend  on  an  a  priori  setting  of  a  cluster  threshold  is  desired.  Speciation  techniques  like  mating  restriction  need 
to  be  tried  with  clustering  to  see  if  there  is  any  advantage  for  encouraging  early  cluster  formation  beyond  what  the  GA  does 
normally.  It  would  also  be  of  interest  to  apply  this  method  to  a  binary  GA. 

In  addition,  work  must  be  done  to  refine  the  local  optimizer,  perhaps  enhancing  its  ability  to  escape  from  small  “traps,” 
because  it  did  not  perform  as  well  as  expected,  given  that  the  cluster  methods  nearly  always  placed  the  local  optimizer 
starting  point  fairly  close  to  the  optimum  value.  A  “fully”  converged  GA  often  placed  the  local  optimizer  just  a  little  closer  to 
the true optimum, close enough to produce a statistically significant difference in design performance in many cases.

In  summary,  then,  the  method  of  clustering  described  in  this  paper,  though  simple  and  relatively  unsophisticated,  shows 
tremendous  promise  at  enhancing  the  efficiency  of  a  GA.  It  showed  time  savings  in  every  case  it  was  applied,  with  the 
antenna  design  problems  showing  greater  efficiency  enhancement  for  less  degradation  in  fitness  than  the  test  functions.  This 
indicates  that  this  method  may  be  most  effective  for  the  problems  where  efficiency  is  most  needed:  large,  time-consuming 
problems  that  are  currently  very  difficult  or  even  intractable  using  standard  GA  optimization. 

Abstract - The development of efficient and effective
algorithms for sparse matrix bandwidth minimization is
of paramount importance for the enhancement of many
numerical techniques for the analysis of microwave circuits.
The task of bandwidth reduction is computationally hard.
Several approaches have already been proposed,
but the problem is still open.

In this paper, a genetic solution is proposed. The genetic
algorithm is described, as well as its main characteristics
(choice of chromosomes, genetic operations, etc.).
Results demonstrate that the advantages of the genetic
approach vanish because of the huge computational effort
required. This severe limitation is removed thanks
to the natural amenability of genetic algorithms to a parallel
implementation. Results in the paper prove that a
parallel genetic approach is a state-of-the-art solution to
the problem of bandwidth reduction of sparse matrices
encountered in electromagnetic numerical methods.

I.  Introduction 

The use of numerical methods is nowadays the most
typical way to approach the design of complex microwave
circuits with a high degree of accuracy, at low cost,
and with a substantial reduction of the time needed for
trimming and tuning. The solution of a linear system of
equations

$$A x = B \qquad (1)$$

is quite often the computational core of numerical methods [1].
In some cases, the system (1) is solved many
times, with different right-hand sides B and the same
matrix A, and generally the matrix properties affecting
the efficiency of the solution are

•  its  pattern 

•  its  condition  number 

In many MW applications, both items have a predictable
behaviour. For instance, some numerical approaches
typically produce sparse matrices (such as in
the case of Mode-matching [1], or Finite Element Methods [2]),
with a distribution of non-zero elements which
can in some cases be predicted. Other approaches,
such as the discretization with the Method of Moments
(MoM) of mixed-potential integral equations (MPIE) for
planar circuits, generate impedance matrices which can
be turned, with suitable thresholding of their entries,
into sparse matrices with a typical blocked-banded
pattern. The use of wavelet expansions, for instance
in conjunction with a MoM discretization of the solving
equations, can improve the condition number (when
orthogonal wavelets are used) and increase the matrix
sparsity.

Several efforts have been made to suitably treat
the matrix properties, so that efficient linear algebra
can be performed inside electromagnetic (EM) codes:
the use of appropriate solvers [3], [4], [5], analytical/numerical
approaches for reducing the filling-in of
the moment matrix [6], or the coupled use of appropriate
solvers with high-performance architectures [7], just
to mention some recent works.

It has been demonstrated [8] that, in many cases, the
most robust and efficient strategy is based on an appropriate
numbering of the problem's unknowns (x in
(1)), so that the system is reduced to a banded system
with reduced bandwidth. This allows the use of a
banded direct factorize-and-solve algorithm, with high
efficiency (its complexity depends quadratically on the
matrix bandwidth [9]).

As a matter of fact, the efficiency and effectiveness of
algorithms for sparse matrix bandwidth reduction are crucial
for the high-performance analysis of MW circuits.
The identification of an optimum permutation matrix P
such that

$$(P A P^T)(P x) = P B \qquad (2)$$

is a banded system with minimum bandwidth is an NP-hard
task [10], and is amenable to a possible solution with
a genetic algorithm.

In this paper, we propose a genetic method for the
reduction of the bandwidth of sparse matrices obtained in
different MW numerical methods. In Section II, we describe
the problem and its general issues. In Section III
we describe the proposed genetic solution. In Section IV
we compare its results with other bandwidth reducers.
In Section V we briefly discuss a parallel version of the
genetic approach, and finally draw some conclusions.

II. The Problem of Bandwidth Reduction: Why Using Genetic Algorithms

Referring to equation (1), the problem is the following:
consider the bandwidth $\beta$ of the matrix A,

$$\beta = \max |i - j| \quad \forall\, i, j \mid a_{ij} \neq 0 \qquad (3)$$

A sparse matrix of dimension N with symmetrical zero-non-zero
pattern can be represented by a graph, as in
Fig. 1, once each row/column is numbered. A
vector $\Pi = \{\pi_1, \pi_2, \ldots, \pi_N\}$ is a possible numbering, and
is represented by a permutation of the initial numbering
$\{1, 2, \ldots, N\}$. The solution of the problem is represented
by an optimum $\Pi_{opt}$ such that

$$\beta(\Pi_{opt}) = \min(\beta(\Pi)) \quad \forall\, \Pi \qquad (4)$$

In the case of a non-symmetrical zero-non-zero pattern, this
graph representation runs into difficulties and is, as far as
we know today, substantially useless.
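
As an illustration of (3) and (4), the following minimal Python sketch evaluates the bandwidth of a sparsity pattern before and after a renumbering. It is not taken from the paper: the function name `bandwidth`, the 0-based indexing, and the small 5x5 example pattern are assumptions made for the example.

```python
def bandwidth(nonzeros, numbering=None):
    """Bandwidth as in Eq. (3): max |i - j| over the nonzero entries a_ij.

    nonzeros  : list of (i, j) index pairs (0-based) where a_ij != 0
    numbering : optional list; numbering[k] is the new index assigned to
                row/column k (the identity numbering is used if omitted)
    """
    if numbering is None:
        return max(abs(i - j) for i, j in nonzeros)
    return max(abs(numbering[i] - numbering[j]) for i, j in nonzeros)


# Hypothetical symmetric 5x5 pattern: a chain 0-4-1-3-2 that is badly numbered.
edges = [(0, 4), (4, 1), (1, 3), (3, 2)]
pattern = edges + [(j, i) for i, j in edges] + [(k, k) for k in range(5)]

original = bandwidth(pattern)
# Renumber the unknowns in the order in which they appear along the chain.
renumbered = bandwidth(pattern, numbering=[0, 2, 4, 3, 1])
print(original, renumbered)
```

For this toy pattern the renumbering turns the matrix into a tridiagonal one, which is the kind of improvement a bandwidth reducer searches for over all permutations.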

The  solutions  to  the  bandwidth  minimization  problem 
proposed  in  the  literature  till  now  can  be  divided  into 
two  main  classes: 

•  Solutions  based  on  a  graph  representation 

•  Alternative  solutions 

The most important approach based on graph representation
is the one proposed by Cuthill and McKee (CM)
in 1969 [11]. They proposed some efficient heuristics
to identify $\Pi_{opt}$, by introducing: 1) a partitioning of the
graph into levels; 2) new vertices at a maximum distance;
3) heuristic rules for cutting some edges and creating
new ones (see Fig. 1). Several upgrades of the CM approach
have been proposed. The one by Gibbs, Poole
and Stockmeyer (GPS) [12] is extremely efficient, even
though it has recently been surpassed by the one by Esposito,
Malucelli and Tarricone (EMT) [8], [13], which
has been defined as the current state of the art for the
bandwidth minimization of matrices generated by EM
codes [14].


Fig. 1. A sparse matrix with symmetrical zero-non-zero pattern
can be represented by a graph, once rows/columns have been
numbered. A level partitioning can be identified on the graph,
once two vertices V1 and V2 have been selected. A permutation
or renumbering of rows/columns modifies the matrix
pattern and the graph layout, with effects on the matrix bandwidth.

The alternative approaches proposed so far rely on
combinatorial techniques based on global optimization.
Examples are the use of simulated annealing
(SA) [15] and of Tabu-Search (TS) [16]. In both cases,
heuristic rules are introduced, in conjunction with an
appropriate use of data structures to take into account
the evolution of the search, so that the risk of being
trapped in local optima is reduced.

Despite the strong efforts made so far, several
problems are still open. For instance, CM and GPS
have severe trouble with some pathological cases arising
from FEM simulation of boxed microstrip lines, or
MM analysis of rectangular waveguide circuits [1], [17].
Moreover, they cannot cope with the non-symmetrical
matrix structures encountered, for instance,
in some cases when wavelet expansions are used
with the MoM [4]. The EMT approach has solved these
problems, but its performance on non-symmetrical matrices
can be enhanced. As for the SA and TS approaches,
they are quite appropriate to overcome the problem of
non-symmetrical patterns, but their computational cost is
still too high to make their use appealing in routinely
used CAD tools.

On this basis, experimenting with a genetic algorithm
(GA) approach to the problem is quite interesting. In
fact, especially for large matrices, the use of appropriate
global search strategies, with the possibility of embedding
complex heuristic rules, is essential for finding
satisfactory solutions. Moreover, a GA is naturally
suited to representing non-symmetrical problems,
with a consequent advantage with respect to graph approaches.
It is also easier to implement than graph approaches.
Finally, its expected drawback, i.e. its computational
cost, can easily be circumvented by a migration
to parallel platforms (a GA is intrinsically amenable
to a parallel design).

III. The Genetic Solution

Genetic algorithms are nowadays commonly used in
the design and optimization of EM circuits [18]. We
refer the reader to the pioneering works of Goldberg [19] and
Holland [20] for the basic concepts, and describe here
the main features of the proposed GA.


A.  Choice  of  chromosomes 

As stated in (4), the unknown of the problem is a vector
of natural numbers called $\Pi_{opt}$. Consequently, it is
natural to define chromosomes as strings of natural numbers,
of the same dimension as the matrix A. This choice
has a major drawback. During the usual operations
on chromosomes, for instance when performing
crossover, we risk generating non-feasible
chromosomes, such as versions of $\Pi$ with repeated
numbers. On the other hand, crossover is well known
to be of fundamental importance for the efficiency
and effectiveness of the GA. Therefore, in order to avoid
the problem of repeated numbers after crossover, a
set of data structures and dedicated algorithms has
been designed. The data structures are: 1) the current
permutation vector $\Pi$; 2) an auxiliary vector Aux,
initialized with a certain permutation without repeated
numbers; 3) a vector New$\Pi$ holding the generated permutation.
It must be stressed that New$\Pi$ can host permutations
with repeated natural numbers. The dedicated
algorithms allow the generation of permutations with
repeated numbers, and their transformation into permutations
without repeated numbers, so that a one-to-one
correspondence is guaranteed between each New$\Pi$
instance and each feasible $\Pi$ instance.

Before describing the algorithms, we introduce a function
foundpos($\Pi(1)$), which finds the position inside
Aux of the first entry $\Pi(1)$ of array $\Pi$. For instance, if
we have $\Pi$ = {3, 1, 5, 4, 2} and Aux = {1, 2, 3, 4, 5},
then foundpos($\Pi(1)$) = foundpos(3) = 3. We also introduce
a function delete(arr(i)), which deletes the i-th entry from
array arr. For instance, if we have Aux = {1, 2, 3, 4, 5},
delete(Aux(3)) turns Aux into {1, 2, 4, 5} (its dimension
has been reduced by one).

The algorithm for generating a modified permutation
with repeated numbers is now described. The joint use
of the New$\Pi$ and Aux data structures guarantees a one-to-one
correspondence between each instance of New$\Pi$ and
one instance of $\Pi$ (i.e. a permutation vector without
repeated numbers):

    for i = 1, N
        NewΠ(i) = foundpos(Π(1)) - 1
        delete(Aux(NewΠ(i) + 1))
        delete(Π(1))
    end

The implementation of this algorithm results, for instance,
in the following steps for a given current permutation
and auxiliary permutation. Each row shows $\Pi$ and Aux
before step i, and New$\Pi$ after it:

i   Π       Aux     NewΠ
1   31542   12345   2
2   1542    1245    20
3   542     245     202
4   42      24      2021
5   2       2       20210

As is apparent, the final New$\Pi$ vector has some repeated
numbers. Used in conjunction with Aux, it is
sufficient to recover the corresponding $\Pi$. The
conversion is performed by simply reversing the algorithm
that generates the modified permutation.
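
The encoding and its inverse can be sketched in a few lines of Python. This is an illustrative reading of the pseudocode above rather than the authors' implementation: the names `encode` and `decode` are hypothetical, and Aux is assumed to be initialized to {1, 2, ..., N} as in the worked example. The construction is closely related to what is commonly called a Lehmer code.

```python
def encode(perm):
    """Turn a permutation of 1..N into the NewPi index vector."""
    aux = sorted(perm)            # Aux = {1, 2, ..., N}
    new_pi = []
    for value in perm:            # value plays the role of the current Pi(1)
        pos = aux.index(value)    # foundpos(Pi(1)) - 1, i.e. a 0-based position
        new_pi.append(pos)
        del aux[pos]              # delete(Aux(NewPi(i) + 1))
    return new_pi


def decode(new_pi):
    """Invert encode(): rebuild the permutation from NewPi and a fresh Aux."""
    aux = list(range(1, len(new_pi) + 1))
    return [aux.pop(pos) for pos in new_pi]


code = encode([3, 1, 5, 4, 2])
print(code)            # [2, 0, 2, 1, 0]
print(decode(code))    # [3, 1, 5, 4, 2]
```

The property that matters for the GA is that any vector whose i-th entry (counting from 0) lies between 0 and N-1-i decodes to a permutation without repetitions, so entries can be exchanged position by position between two encodings without producing infeasible chromosomes.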

B.  Initial  Population 

The proposed implementation of the GA has
proved to be nearly insensitive to the chosen starting
population, provided that its cardinality is suitable with
respect to the size of the problem (the matrix dimension
N).

As already observed for different combinatorial
heuristics [21], no deterministic laws have been determined
to describe the convergence of the GA with respect
to the population generation, as well as to its cardinality.
In the current implementation, we generate a
starting population by random extraction of permutation
vectors from the starting choice $\Pi = \{1, 2, \ldots, N\}$.
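
A minimal sketch of such a random starting population, assuming uniform random permutations drawn with Python's standard `random` module (the function name `initial_population` and the `seed` argument are hypothetical, added only to make the example reproducible):

```python
import random


def initial_population(n, cardinality, seed=None):
    """Return `cardinality` random permutations of {1, ..., n}."""
    rng = random.Random(seed)
    base = list(range(1, n + 1))          # starting choice Pi = {1, 2, ..., N}
    # rng.sample(base, n) returns a new shuffled copy and leaves base intact.
    return [rng.sample(base, n) for _ in range(cardinality)]


population = initial_population(n=6, cardinality=4, seed=0)
for chromosome in population:
    print(chromosome)
```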


C.  Cost  function 

The choice of a suitable cost function is of paramount
importance for the convergence of a combinatorial optimization
task. The bandwidth minimization can be performed
with different choices of the cost function. One
of the most important issues is the selection of a cost
function such that as few different solutions $\Pi$ as possible
have equal cost and thus risk being considered equivalent.
For instance, the very trivial choice of a cost function

$$c(\Pi) = \beta(\Pi) \qquad (5)$$

where the bandwidth corresponding to a certain permutation
vector is the cost, is not satisfactory at all. An
enhancement can be the following choice:

$$c(\Pi) = w_1 \beta(\Pi) + w_2 N_\beta \qquad (6)$$

where $N_\beta$ is the number of rows/columns that have
maximum bandwidth $\beta$, whilst $w_1$ and $w_2$ are tunable
weights. Of course, in the case of unsymmetrical patterns,
the same function can be transformed into

$$c(\Pi) = (w_{1L} \beta_L(\Pi) + w_{2L} N_{\beta L}) + (w_{1U} \beta_U(\Pi) + w_{2U} N_{\beta U}) \qquad (7)$$

where subscripts U and L correspond to the "upper" and
"lower" parts of the matrix (with respect to the main
diagonal). The three proposed choices are still not completely
satisfactory: even in the case of (6) or (7) there
are many different permutation vectors corresponding
to the same value of $c(\Pi)$.
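
To make the difference between (5) and (6) concrete, the sketch below evaluates (6) for two hypothetical 5x5 patterns that share the same bandwidth. It assumes that $N_\beta$ counts the rows whose farthest nonzero entry lies exactly at distance $\beta$ from the diagonal; the function names, the weights, and the patterns are illustrative choices, not values from the paper.

```python
def row_bandwidths(nonzeros, n):
    """Largest |i - j| found in each row i of an n x n pattern (0-based)."""
    widths = [0] * n
    for i, j in nonzeros:
        widths[i] = max(widths[i], abs(i - j))
    return widths


def cost_eq6(nonzeros, n, w1=1.0, w2=0.1):
    """Cost of Eq. (6): w1 * beta + w2 * N_beta (weights are hypothetical)."""
    widths = row_bandwidths(nonzeros, n)
    beta = max(widths)
    n_beta = widths.count(beta)   # ASSUMPTION: rows attaining the bandwidth
    return w1 * beta + w2 * n_beta


diagonal = [(k, k) for k in range(5)]
pattern_a = diagonal + [(0, 3), (3, 0)]
pattern_b = pattern_a + [(1, 4), (4, 1)]

cost_a = cost_eq6(pattern_a, 5)
cost_b = cost_eq6(pattern_b, 5)
print(cost_a, cost_b)
```

Both patterns have $\beta = 3$, so (5) cannot tell them apart, whereas (6) assigns a higher cost to the pattern with more rows at the maximum distance.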

Some new ideas have been proposed in [15], and suggest
the following solution to the problem of a suitable
cost function:

=  (8) 

where N is the matrix size, and F is the following function:

$$F(N, |i-j|) = \begin{cases} N & |i-j| = 0, 1 \\ (N - |i-j|) \cdot F(N, |i-j| - 1) & \text{elsewhere} \end{cases} \qquad (9)$$

The  choice  of  (8)  guarantees  an  adequate  partitioning 
of  the  searching  space,  with  a  substantial  reduction  of 
the  risk  of  equivalence  among  different  permutations. 
This  is  the  cost  function  implemented  in  the  proposed 
GA. 

D.  Convergence  Criterion 

The sparse matrix bandwidth reduction is typically
used in order to improve the solution time of linear
systems by using banded solvers, which have a
quadratic complexity with respect to the matrix bandwidth.
Therefore, it is possible to evaluate the effectiveness
of each iteration by comparing the time needed
for a single iteration with the induced reduction
of the solution time. This practical parameter,
averaged over a certain number of iterations, is appropriate
for deciding when the bandwidth reduction should
be stopped.
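
One possible way to turn this criterion into code is sketched below. It is an assumption-laden illustration, not the authors' rule: the solve-time model (proportional to $N\beta^2$), its constants, the averaging window, and the names `quadratic_solve_time` and `should_stop` are all hypothetical.

```python
def quadratic_solve_time(beta, n=639, n_solves=100, seconds_per_op=1e-7):
    """Hypothetical model: banded solve time grows as n * beta**2."""
    return n_solves * seconds_per_op * n * beta ** 2


def should_stop(iter_times, bandwidths, window=5,
                solve_time=quadratic_solve_time):
    """Stop when an iteration costs more time than it saves, on average.

    iter_times : seconds spent in each GA iteration so far
    bandwidths : best bandwidth after each iteration, with the initial
                 bandwidth as element 0 (one entry longer than iter_times)
    """
    if len(iter_times) < window:
        return False                      # not enough history yet
    spent = sum(iter_times[-window:]) / window
    saved = (solve_time(bandwidths[-window - 1])
             - solve_time(bandwidths[-1])) / window
    return saved < spent


# Still improving quickly: keep iterating.
print(should_stop([1.0] * 5, [150, 140, 130, 120, 110, 100]))
# Bandwidth has stagnated: each further iteration is wasted time.
print(should_stop([1.0] * 5, [100] * 6))
```

In practice the solve-time model would be replaced by measured timings of the banded solver on the target machine.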

E. Genetic Operators

We use three operators: selection, crossover and mutation.

E.1 Selection

We adopt the most typical way of performing selection,
i.e. on a cost-proportional basis. This means that
$N_{sel}$ elements of the population are randomly chosen,
and the one with the lowest cost is selected.

E.2 Crossover

The basic idea is to generate hybrid chromosomes
by crossing together two selected chromosomes. This
idea is here coupled with another empirical observation:
for each matrix pattern, some rows/columns are more
effective than others when performing the permutation.
Therefore, when the optimum or quasi-optimum position
is found for them, the corresponding information
should be preserved in the permutation vector. The natural
translation of this idea is the principle of building
blocks, described further below.

We now briefly describe when and how crossover is
performed.

•  When crossover is to be performed: this is decided
following a probabilistic approach [22]. Two vectors
from the old population are selected in accordance
with the selection operator. One random number
$p_1$ is generated. The two vectors are inserted into
the new population if $p_1 > p_c$. A second random
number $p_2$ is generated, and crossover is performed if
$p_2 > p_c$. The value of $p_c$ is a heuristically tunable
parameter.

•  How crossover is performed: two random numbers
are generated to identify the beginning and the end
of the crossing site. Two new chromosomes are
obtained by exchanging the crossing sites between
the two vectors. For instance, if we indicate with
$n_1$ and $n_2$ the two random numbers, and with $\Pi_1$
and $\Pi_2$ the two permutation vectors, the entries
$\Pi_1(n_1, \ldots, n_2)$ are swapped with $\Pi_2(n_1, \ldots, n_2)$.

In accordance with the principle of preserving building
blocks [23], we know that a purely random
choice of the crossing site is often unsatisfactory.
Therefore, by using some statistical data about
the role of each element of the permutation vector
$\Pi$ during the search, some positions inside the
chromosome are protected from destruction during
crossover. The protected positions typically correspond
to rows/columns of the matrix giving a low
contribution to the value of the cost function (8).
For instance, referring to the previous example, if a
position within the range $(n_1, \ldots, n_2)$ is ranked as
a building block, no swapping is performed on it.
Of course, when performing crossover, the data
structures Aux and New$\Pi$ must be suitably managed,
so that the modified permutation can be
turned into a permutation vector $\Pi$ without repetitions.
The algorithm mentioned in Section III.A
is still valid, and must only be adjusted to cope
with the problem of the beginning and ending points of
the crossing site.
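
The selection and crossover steps can be sketched as follows. This is one possible reading, not the authors' code: the selection rule as described (draw $N_{sel}$ individuals, keep the cheapest) corresponds to what is usually called tournament selection, and the crossing-site swap is assumed here to act on the New$\Pi$ encodings of Section III.A so that both children decode to valid permutations. All function names, the 0-based positions, and the `protected` argument are hypothetical.

```python
import random


def encode(perm):
    """Permutation of 1..N -> index vector (see Section III.A)."""
    aux, code = sorted(perm), []
    for value in perm:
        pos = aux.index(value)
        code.append(pos)
        del aux[pos]
    return code


def decode(code):
    """Index vector -> permutation of 1..N."""
    aux = list(range(1, len(code) + 1))
    return [aux.pop(pos) for pos in code]


def tournament_select(population, cost, n_sel, rng):
    """Draw n_sel individuals at random and return the one with lowest cost."""
    return min(rng.sample(population, n_sel), key=cost)


def crossover(parent1, parent2, n1, n2, protected=()):
    """Swap positions n1..n2 (0-based, inclusive) of the two encodings,
    leaving protected (building-block) positions untouched."""
    c1, c2 = encode(parent1), encode(parent2)
    for k in range(n1, n2 + 1):
        if k not in protected:
            c1[k], c2[k] = c2[k], c1[k]
    return decode(c1), decode(c2)


child1, child2 = crossover([3, 1, 5, 4, 2], [5, 4, 3, 2, 1],
                           n1=1, n2=2, protected={2})
print(child1, child2)
```

Because position k of any encoding only takes values between 0 and N-1-k, exchanging entries at the same position keeps both encodings decodable, which is why no repair step is needed after the swap in this sketch.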


E.3  Mutation 

Three kinds of mutation are performed: swap, left
shift and right shift. One tunable parameter $p_m$ is chosen,
and two random numbers $pos_1$ and $pos_2$ are generated. A
new random number is generated. If it is larger than $p_m$,
genes $pos_1$ and $pos_2$ in the chromosome are swapped,
and a left and right shift is performed over the partition
of the vector starting at $pos_1$ and ending at $pos_2$.
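
The three mutation moves can be illustrated as below. The text does not specify how the shifts treat the entry pushed out of the partition, so a cyclic rotation is assumed here; the function names and 0-based positions are likewise assumptions made for the example. All three moves only rearrange existing entries, so the result is always a permutation without repetitions.

```python
def swap(perm, pos1, pos2):
    """Exchange the genes at pos1 and pos2."""
    out = list(perm)
    out[pos1], out[pos2] = out[pos2], out[pos1]
    return out


def shift_left(perm, pos1, pos2):
    """Rotate the slice pos1..pos2 (inclusive) one place to the left."""
    out = list(perm)
    seg = out[pos1:pos2 + 1]
    out[pos1:pos2 + 1] = seg[1:] + seg[:1]   # ASSUMPTION: cyclic shift
    return out


def shift_right(perm, pos1, pos2):
    """Rotate the slice pos1..pos2 (inclusive) one place to the right."""
    out = list(perm)
    seg = out[pos1:pos2 + 1]
    out[pos1:pos2 + 1] = seg[-1:] + seg[:-1]  # ASSUMPTION: cyclic shift
    return out


chromosome = [3, 1, 5, 4, 2]
print(swap(chromosome, 1, 3))         # [3, 4, 5, 1, 2]
print(shift_left(chromosome, 1, 3))   # [3, 5, 4, 1, 2]
print(shift_right(chromosome, 1, 3))  # [3, 4, 1, 5, 2]
```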

When mutation is performed, the principle of preserving
building blocks is not respected. Moreover, a
distance-dependent mutation is implemented [24]. In
fact, it is well known that, especially when small populations
of chromosomes are used, a fixed value
of $p_m$ does not prevent premature convergence
to local minima. Therefore, the value of $p_m$ is dynamically
adapted, in order to avoid being trapped in
unsatisfactory solutions.

IV.  Results  on  serial  platforms 

We present two types of results. The former
refers to matrices encountered in the analysis of 1) rectangular
waveguides inhomogeneously filled with dielectric
(Fig. 2) or 2) boxed microstrip lines (Fig. 3). A
revisited version of a public-domain FEM code, called
EMAP1, based on a variational scalar formulation [25],
is used.


Fig. 2. A rectangular waveguide inhomogeneously filled with
dielectric. Different dielectrics and geometries have been chosen.
One of the examples is shown in the figure.

The latter refers to matrices generated during the
analysis of microstrip circuits with an MPIE-MoM formulation [26].
In all the proposed cases, the performance
of the GA is compared with a commercial CM
approach available in MATLAB, a GPS and TS solution
implemented by the author, and with the previously
mentioned EMT solution described in [8], [13].

Fig. 3. A boxed microstrip line. Different cases with different
dimensions and dielectric layers have been simulated.

A. FEM Analysis

Table I reports results for problems such as the one
in Fig. 2. A standard WR90 waveguide is studied in the range
8-12 GHz, and the electric field distribution is evaluated with
different dielectric fillings.


N      Initial β   GPS   CM    EMT   TS    GA
284    92          115   74    62    62    62
374    107         122   102   72    106   96
639    151         178   172   87    102   91
1231   251         247   242   199   233   212

Table I: Results for different matrices generated during
a FEM analysis of inhomogeneously filled rectangular
waveguides. Matrix size N, initial bandwidth β, and
final bandwidth attained with different approaches are
reported.

As is apparent from Tab. I, GPS and CM behave critically
in some pathological cases. The EMT approach
is the most robust, even though the GA is quite
effective as well. An essential issue is the time needed
to achieve the solution. It is reported in Tab. II, on a
Pentium 166 MHz:


N      GPS     CM     EMT     TS        GA
284    0.218   0.22   0.215   6.9       7.2
374    0.74    0.87   0.560   19.1      18.9
639    2.4     3.2    1.44    498       480
1231   18.8    16     3.74    > 10000   > 10000

Table II: Times to find the optimum Π for the
cases in Tab. I.


Tab. II clearly demonstrates the real limitation of
the GA: it is quite effective, but computationally too
heavy. For instance, since the FEM generates
banded matrices, we can compare the standard
use of a direct banded solver (DBS) without bandwidth
reduction (i.e. what EMAP1 routinely does) with the
use of the same solver after bandwidth
reduction. The time (in seconds) needed for a 100
frequency-point analysis is reported in Tab. III:


N      EMAP1   EMT+DBS   GA+DBS
374    264.8   186.1     242
639    798.4   395.2     961
1231   12270   1376      > 30000

Table III: Times in seconds to analyze some circuits at
100 frequency points with the FEM code EMAP1, compared
with the use of bandwidth reduction in
conjunction with a direct banded solver (DBS).

It is easily seen that when the problem dimension
grows, the numerical complexity of the GA becomes a
substantial limitation, whilst the EMT approach is quite
advantageous. Similar results are attained in the case
of circuits such as the one in Fig. 3. Table IV reports
some results, with the same scheme as Tab. III:


N     EMAP1   EMT+DBS   GA+DBS
484   284.8   24.6      212.1
720   737.5   162.7     13211

Table IV: Times in seconds to analyze some circuits at
100 frequency points with the FEM code EMAP1, compared
with the use of bandwidth reduction in
conjunction with a direct banded solver (DBS).

The matrices generated in the case of boxed microstrip
lines have a smaller bandwidth than those of
inhomogeneously filled waveguides, and this
explains the reduced simulation times.


B.  MPIE/MoM  Analysis 

We refer to an MPIE formulation using closed-form
spatial-domain Green's functions, discretized with a
Galerkin MoM with roof-top functions. As described
in [26], the analysis of microstrip circuits with this
approach originally generates dense impedance matrices;
however, a thresholding operation can be performed on the
matrix, so that all entries smaller than a certain value
are zeroed. This implies a very small approximation
error (around 1%), provided that a suitable threshold is
identified. In the large majority of cases, a value of 10^-6
relative to the largest entry in the matrix is appropriate,
and a matrix sparsity between 70 and 85% is
achieved.
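
The thresholding step can be illustrated with a short sketch. It is only a toy example, assuming a small matrix stored as nested Python lists and a bandwidth defined as the largest |i - j| over the non-zero entries; the function names `threshold_matrix`, `sparsity`, and `bandwidth` are hypothetical and are not taken from [26].

```python
def threshold_matrix(z, rel_tol=1e-6):
    """Zero every entry whose magnitude is below rel_tol times the largest magnitude."""
    largest = max(abs(v) for row in z for v in row)
    cutoff = rel_tol * largest
    return [[v if abs(v) >= cutoff else 0 for v in row] for row in z]


def sparsity(z):
    """Fraction of entries that are exactly zero."""
    total = sum(len(row) for row in z)
    zeros = sum(1 for row in z for v in row if v == 0)
    return zeros / total


def bandwidth(z):
    """Largest |i - j| over the non-zero entries (assumed definition)."""
    return max((abs(i - j) for i, row in enumerate(z)
                for j, v in enumerate(row) if v != 0), default=0)


# Toy 4x4 matrix: strong coupling near the diagonal, very weak coupling far from it.
z = [[1.0, 0.2, 1e-8, 1e-9],
     [0.2, 1.0, 0.3, 1e-8],
     [1e-8, 0.3, 1.0, 0.1],
     [1e-9, 1e-8, 0.1, 1.0]]
zt = threshold_matrix(z)
print(sparsity(zt), bandwidth(z), bandwidth(zt))
```

In this toy case the dense matrix becomes banded after thresholding, which is the situation in which a bandwidth reducer followed by a banded solver becomes attractive.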

Referring to the circuits of Fig. 4, we report results
in Tab. V, where we compare the times for the analysis of
the circuit using an iterative sparse solver (ISS) with
those obtained using different bandwidth reduction
approaches in conjunction with the DBS. Both the ISS and
the DBS come from the same public-domain library (Lapack).
A dispersion curve of 100 frequency points is
evaluated for both circuits. The single-stub circuit operates
in the range 7.5-12 GHz, the double-stub circuit between 8 and
18 GHz.

N     ISS     EMT+DBS   GA+DBS
280   113.4   23.6      57.1
448   312.5   84.1      412

Table V: Times in seconds to analyze the two circuits in
Fig. 4 at 100 frequency points using the MPIE/MoM
with the ISS, compared with the use of the MPIE/MoM with
bandwidth reduction in conjunction with a DBS.

Also in this case, it is apparent that the performance
of the GA is less attractive than that of the EMT and,
above all, that it degrades as the size of the
problem increases.


Fig. 4. The two circuits analyzed with the MPIE/MoM. For
the single stub, εr = 10.65, d = 1.27 mm, w1 = w2 = 1.44 mm,
L = 17.28 mm, Ls = 2.16 mm. For the double stub, εr = 9.9,
d = 10 mm, w1 = 9.2 mm, w2 = 23 mm, L = 110.6 mm.

V.  Parallel  GA  Solution 

Recent progress in parallel computing, and
above all the development of low-cost and efficient
parallel platforms such as clusters of PCs, can change the
perspective opened by the previous observations. As is
apparent from the previous sections, the several advantages of
the GA, i.e. its easy implementation and its ability to
cope with pathological cases as well as with non-symmetrical
or unstructured patterns, are nullified
by its large numerical complexity. Luckily, the
nature of the GA renders it intrinsically amenable to a parallel
design. The large majority of its tasks, such as the
generation of a farm of initial populations and the
evolution of each population, can be performed in parallel
on different processors. The percentage of potentially
parallel tasks, with respect to the overall serial work,
ranges between 80 and 95%, depending on the problem
size (the dimension of Π) and the selection of some heuristic
parameters, such as p_m and p_c.

Therefore, a parallel version of the GA, called PGA,
has been implemented using the Parallel Virtual Machine
(PVM) programming interface on an IBM SP2 with
8 nodes. The PGA performs a parallel generation of
a farm of initial populations, and periodically collects
the results of the evolutionary search from each
population, so that cross-over and mutation are performed
over chromosomes from different populations, with an
increase in the level of hybridization. This can be
described as a first, coarse level of parallelism. A second,
fine level of parallelism is represented by the evaluation
of the cost function, which is performed in parallel. This
task is quite heavy, especially when large problems are
attacked, and can be performed in parallel with a
suitable block decomposition of both the matrix and the
permutation vector Π.
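
As a rough guide to what the quoted 80 to 95% parallel fraction could mean on 8 nodes, the sketch below applies Amdahl's law. This is an assumption added for illustration: the text does not invoke Amdahl's law, and the formula ignores communication overhead and any difference between the serial and parallel machines, so it is not a prediction of the measured times.

```python
def amdahl_speedup(parallel_fraction, n_procs):
    """Ideal speed-up when only parallel_fraction of the work scales with n_procs."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / n_procs)


# Parallel fractions quoted in the text, evaluated for an 8-node machine.
for p in (0.80, 0.95):
    print(f"parallel fraction {p:.2f}: ideal speed-up on 8 nodes = {amdahl_speedup(p, 8):.2f}")
```

Under these idealized assumptions the serial remainder, not the node count, quickly becomes the limiting factor.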

A.  Results  with  PGA 

In Tab. VI, results of the PGA for the matrices encountered
in the FEM analysis (see Tab. I) are reported:
the achieved bandwidth and the computing time when
using 8 SP2 nodes.


N      In. β   Opt. β   Time (s)
284    92      54       1.4
374    107     66       2.5
639    151     74       54
1231   251     151      1123

Table VI: Results for PGA on matrices from FEM
analysis of MW circuits. Matrix size N, initial
bandwidth β, and final bandwidth are reported.

As shown in Tab. VI, computing times are reduced,
and the effectiveness of bandwidth reduction is
improved. The use of PGA results in the times reported
in Tab. VII for a 100-frequency-point dispersion curve
of circuits as in Fig. 2 (compare with Tab. III):


N      EMAP1   EMT+DBS   PGA+DBS
374    264.8   186.1     193.1
639    798.4   395.2     422.7
1231   12270   1376      1642

Table VII: Times in seconds to analyze some circuits at
100 frequency points with the FEM code EMAP1, compared
with the use of parallel bandwidth reduction in
conjunction with a direct banded solver (DBS).

As demonstrated in Tab. VII, the performance
of PGA turns the genetic approach into an effective
method to reduce the time for the numerical analysis
of MW circuits, thanks to the substantial decrease in
bandwidth reduction time, as well as to the improvement
in the effectiveness of the search. The efficiency of the PGA
is similar to that of the state-of-the-art EMT. Speed-ups
in the simulation times up to a factor of 8 have been
observed.

VI. Conclusions

A genetic solution (GA) to the problem of sparse matrix
bandwidth minimization has been proposed. The
main characteristics of the approach have been
described, with respect to the choice of chromosomes,
genetic operators, and other heuristic parameters. A
suite of functions has been developed so that the
cross-over can be performed without risk of generating
non-feasible chromosomes. Results have proved that the GA,
despite its several attractive features (simplicity,
flexibility, amenability to global optimization), is not
efficient enough to be considered an appropriate tool for
CAD environments of MW circuits. Thanks to its natural
parallelism, the approach has been migrated towards
parallel platforms (PGA), with a substantial increase in
its efficiency and effectiveness, which are similar to those
of state-of-the-art bandwidth reducers based on graph
theory (EMT).

On the other hand, the GA and PGA are rather simple
to implement, whilst EMT is complex and requires
a deep knowledge of graph theory. Furthermore, it is
reasonable to expect a substantial increase in the
scalability and efficiency of clusters of PCs in the near future,
thanks to the continuous evolution of switch and
fast-ethernet technologies. As a matter of fact, parallel
environments with very affordable costs can be predicted
as the natural future infrastructure for MW CAD of large
and complex circuits. In conclusion, the opening of such
new perspectives turns the genetic approach into a candidate
solution to improve the efficiency of numerical methods for
EM circuits via sparse matrix bandwidth reduction.

Abstract: The population size and mutation rate of a
genetic algorithm have great influence upon the speed
of convergence. Most genetic algorithm enthusiasts use
a large population size and low mutation rate due to the
recommendations of several early studies. These studies
were somewhat limited. This paper presents results that
show a small population size and high mutation rate
are actually better for many problems.

I.  Parameter  Selection  for  a  Simple  Genetic 
Algorithm 

Applications  of  a  genetic  algorithm  (GA)  to  the 
optimization  of  electromagnetics  problems  started  in 
the  early  1990s  [1],  [2]  and  have  exploded  since  then. 
The  optimization  of  array  patterns  using  a  GA  is 
particularly  attractive  for  the  synthesis  of  patterns  that 
have  desirable  characteristics.  Most  of  the  work  has 
followed  traditional  GA  philosophy  when  choosing  the 
population  size  and  mutation  rate  of  the  GA:  a 
relatively large population and a low mutation rate are
used. The choice of population size and mutation rate
can  vary  the  run  time  of  the  GA  by  several  orders  of 
magnitude. 

The first intensive study of GA parameters was done by
De Jong [3] and is nicely summarized in Goldberg [4].
De Jong looked at both on-line and off-line
performance of the GAs. On-line performance is an
average of all costs up to the present generation.
Off-line performance is the best cost found up to the present
generation. He tested five algorithms of varying levels
of complexity on five different cost functions while
varying mutation rate, population size, crossover rate,
and the generation gap. De Jong found that a small
population size improved initial performance while
large population size improved long-term performance.
A higher mutation rate was good for off-line
performance while low mutation rate was best for
on-line performance. The highest mutation rate used was
0.1.
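
The two indicators can be sketched as follows, using the definitions exactly as stated in the paragraph above. The function names and the small cost history are made up for illustration and do not come from [3] or [4].

```python
def online_performance(costs_per_generation):
    """Running average of every cost evaluated up to and including each generation."""
    total, count, out = 0.0, 0, []
    for costs in costs_per_generation:
        total += sum(costs)
        count += len(costs)
        out.append(total / count)
    return out


def offline_performance(costs_per_generation):
    """Best (lowest) cost found up to and including each generation."""
    best, out = float("inf"), []
    for costs in costs_per_generation:
        best = min(best, min(costs))
        out.append(best)
    return out


# Hypothetical costs of the chromosomes evaluated in three generations.
history = [[5.0, 3.0], [4.0, 2.0], [6.0, 1.0]]
print(online_performance(history))
print(offline_performance(history))
```

The on-line indicator is pulled up by every poor chromosome that is evaluated, whereas the off-line indicator only tracks the best solution so far.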

Grefenstette [5] used a meta GA to optimize the on-line
and off-line performance of GAs based on varying six
algorithm parameters: population size, crossover rate,
mutation rate, generation gap, scaling window, and whether
or not elitism was used. A cost function evaluation for the
meta GA consisted of a GA running until 5000 cost
function evaluations were performed on one of the De
Jong test functions and normalizing the result relative to
that of a random search algorithm. Each GA in the
population evaluated each of the De Jong test functions.
The second step in this experiment took the 20 best
GAs found by the meta GA and let them tackle each of
the five test functions for five independent runs. The
best GA for on-line performance had a population size
of 30 and mutation rate of 0.01. The best off-line GA
had a population size of 80 and mutation rate of 0.01.
He concluded that good results could be obtained with a
wide selection of GA parameters.

Schaffer et al. reported results on optimum parameter
settings for a binary GA using a Gray code [6]. This
approach added five more cost functions to the De Jong
test function suite. They had discrete sets of parameter
values (population size = 10, 20, 30, 50, 100, and 200;
mutation rate = 0.001, 0.002, 0.005, 0.01, 0.02, 0.05,
and 0.10; crossover rate = 0.05 to 0.95 in increments of
0.10; and 1 or 2 crossover points) that had a total of
8400 possible combinations. Each of the 8400
combinations was run with each of the test functions.
They averaged the results over 10 independent runs.
The GA terminated after 10,000 function evaluations.
The best on-line performance resulted for the following
parameter settings: population size = 20 to 30 (relatively
small), crossover rate = 0.75 to 0.95, mutation rate =
0.005 to 0.01 (the highest rates tested), and two-point
crossover.

Thomas Bäck [7, 8, 9] has done more recent analyses of
mutation rate. He showed that for the simple counting
problem, the optimal mutation rate is 1/ℓ, where ℓ is the
length of the chromosome [7]. He later showed that
even quicker convergence can be obtained by beginning
with even larger mutation rates (on the order of 1/2) and
letting the rate gradually adapt to the 1/ℓ value [8]. In later
work [9], he compared this evolutionary GA approach
with evolutionary strategies and showed that this
adaptation is similar to the self-adaptation of
parameters that characterizes evolutionary strategies
approaches.
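
One way to picture a mutation rate that starts near 1/2 and settles at 1/ℓ is a fixed decay schedule. The sketch below uses a geometric interpolation, which is purely an assumption for illustration: the text does not give the adaptation rule used in [8], and the name `mutation_rate` and its arguments are hypothetical.

```python
def mutation_rate(generation, n_generations, chrom_length, start=0.5):
    """Geometric interpolation from `start` down to 1/chrom_length (assumed schedule)."""
    end = 1.0 / chrom_length
    # Clamp the progress fraction so calls outside [0, n_generations] stay bounded.
    frac = min(max(generation / n_generations, 0.0), 1.0)
    return start * (end / start) ** frac


# Hypothetical run: 100 generations, chromosomes of 20 genes.
schedule = [mutation_rate(g, 100, 20) for g in range(0, 101, 25)]
print([round(r, 4) for r in schedule])
```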

Gao [10] computed a theoretical upper bound on
convergence rates in terms of population size, encoding
length, and mutation probability in the context of
Markov chain models for a canonical GA. His
resulting theorem showed that the larger the probability
of mutation and the smaller the population, the faster
the GA should converge. However, he discounted these
results as not viable for long-term performance.

Most  of  these  previous  studies  were  done  with  binary 
GAs.  More  engineers  are  discovering  the  benefits  of 
using  real  parameter  GAs,  namely  that  a  continuous 
spectrum  of  parameters  can  be  represented.  Our 
previous  work  with  real  GAs  [11]  devised  a  simple 
check  to  determine  the  best  population  size.  The  GA 
optimized  several  functions,  and  the  results  were 
averaged  over  100  independent  runs.  The  population 
size  times  the  number  of  iterations  (i.e.,  the  total 
number of chromosomes evaluated) was kept constant.
The “goodness” of the algorithm was judged by the
minimum  cost  found.  For  both  binary  and  continuous 
parameter  GAs,  a  small  population  size  allowed  to 
evolve  for  many  generations  produced  the  best  results. 
Similar  sensitivity  analyses  with  a  wider  range  of 
mutation  rates  suggested  that  mutation  rates  in  the 
range  of  0.05  to  0.35  found  the  lowest  minima. 

A quick search of web sites on GAs also shows
conflicting evidence for the best parameters to use.
Some  sites  [12,  13]  suggest  that  GA  performance  may 
be  improved  for  smaller  population  sizes  and  higher 
mutation  rates.  In  addition,  enough  of  our  colleagues 
and  students  have  found  similar  results  for  their  GA 
problems  that  we  decided  further  study  is  necessary. 

These previous studies have shown that parameter
settings are sensitive to the cost functions, options in
the GA, bounds on the parameters, and performance
indicators, which must be carefully considered. In
addition, the optimum parameters seem to depend on
whether the GA is just beginning its descent or whether
it has advanced into the fine-tuning stage of the
solution. Consequently, different studies result in different
conclusions about the optimum parameter values
depending on the problem and the parameters explored.
Davis recognized this issue [14] and outlined a method
of adapting the parameter settings during a run of a GA
[15]. He does this by including operator performance in
the cost. Operator performance is the cost reduction
caused by the operator divided by the number of
children created by the operator. Yet most GA
practitioners still stick to large population sizes and
very low mutation rates.

This paper extends the work in [11] from the
optimization of contrived mathematical functions to the
optimization of array factors. The goal is to help users
of GAs select appropriate population sizes and mutation
rates in order for their GAs to find the best answer as
quickly as possible. Thus, emphasis is placed on
off-line performance since we only care about the closeness
of the final answer to the actual answer and not all the
extraneous solutions included in the averaging of the
on-line indicator. This paper reports the results of
experiments to determine the optimum population size
and mutation rate for a simple real GA on the types of
problems that might be typical in electrical engineering.
Since we want to minimize the run time of the GA, the
criterion for judging the “goodness” of the results is the
number of calls to the objective function required for
solution. This is the metric that determines computer
wall clock time to complete the solution, so the
choice is in keeping with the usual engineering
requirement to minimize run time. The parameters that
produce the minimum number of function calls to
produce an acceptable solution are deemed the “best.”
A solution is “acceptable” when a predetermined value
close to the minimum is found. This definition is
consistent with finding the deepest well, then diving to
the bottom with a fast local optimizer. Determining the
optimum population size and mutation rate must take
into account the random components of the GA.
Therefore, we average over a large number of runs of
our GA before choosing the best parameters.

The  GA  used  here  is  termed  a  real  GA  because  the 
variables  to  be  optimized  are  continuous  and  are  not 
converted  to  binary  values.  Figure  1  shows  a  flow  chart 
of  a  simple  real  GA.  In  each  block  of  the  flow  chart, 
choices  must  be  made  on  how  to  perform  the  GA 
operations  in  that  block.  The  GA  in  this  paper  uses  a 
roulette  wheel  proportional  weighting  selection  and  the 
single  point  crossover  using  the  method  described  in 
[11].  Elitism  is  used.  These  are  common  choices  used 
in  practice  and  are  constants  for  this  particular  study. 
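
A roulette wheel selection for a minimization problem might look like the sketch below. The weighting rule (each slice proportional to how far a chromosome's cost lies below the worst cost in the population) is an assumption chosen for simplicity; the exact weighting used in [11] is not given in the text, and the function names are hypothetical.

```python
import random


def selection_probabilities(costs):
    """Lower cost gives a larger slice of the wheel (minimization)."""
    worst = max(costs)
    weights = [worst - c for c in costs]
    total = sum(weights)
    if total == 0:  # all costs equal: fall back to a uniform wheel
        return [1.0 / len(costs)] * len(costs)
    return [w / total for w in weights]


def spin(costs, rng):
    """Return the index of one parent chosen by spinning the wheel once."""
    probs = selection_probabilities(costs)
    return rng.choices(range(len(costs)), weights=probs, k=1)[0]


# Hypothetical costs of a four-member population.
costs = [-17.2, -12.0, -3.5, 1.0]
probs = selection_probabilities(costs)
parent = spin(costs, random.Random(0))
print([round(p, 3) for p in probs], parent)
```

With this particular rule the worst chromosome is never selected as a parent; with elitism the best chromosome would additionally be copied unchanged into the next generation.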

The results of this investigation show that, for the
problems solved here, small population size and
relatively large mutation rate are far superior to the
large population sizes and low mutation rates that are
used by most of the papers presented in the
electromagnetics community and by the GA community
at large. Such results suggest that future research
consider carefully what parameters are appropriate to
the particular problem.


Figure 1. Flow chart of the real GA. Finding the
optimum mutation rate and population size would
cause the GA to find an acceptable solution faster.

II.  A  Simple  Undulating  Function 

The first example is a highly undulating function with
many local minima. This function is

$f(x,y) = x\sin(4x) + 1.1\,y\sin(2y)$ for $0 < x, y < 10$   (1)

Figure 2 shows a graph of this function. The global
minimum over the specified range is -18.5547 at (x,y) =
(9.0390,8.6682).
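
The quoted minima can be checked numerically. The sketch assumes that function (1) has the form f(x,y) = x sin(4x) + 1.1 y sin(2y); this form is an inference, adopted here because it reproduces both the global minimum quoted above and the second-lowest minimum of -16.9847 at (7.4697, 8.6681) quoted later in this section.

```python
import math


def f(x, y):
    """Test function (1), in the form assumed in the lead-in."""
    return x * math.sin(4 * x) + 1.1 * y * math.sin(2 * y)


print(round(f(9.0390, 8.6682), 4))  # point quoted as the global minimum
print(round(f(7.4697, 8.6681), 4))  # point quoted as the second-lowest minimum
```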


Figure  2.  Plot  of  the  first  function  minimized  by  the 
GA. 

Doing  single  runs  of  a  GA  for  different  sets  of 
population  sizes  and  mutation  rates  doesn’t  yield 
sufficient  information  due  to  the  statistical  nature  of  the 
GA.  To  dampen  the  effects  of  the  random  processes, 
results  are  averaged  over  many  runs  for  each  set  of 
parameters.  Thus,  the  GA  is  run  for  one  set  of 
parameters  until  the  solution  is  found.  After  performing 
T  independent  runs,  the  results  for  the  T  trials  are 
averaged. 

We  posed  the  problem  to  minimize  the  function  with 
the  fewest  number  of  function  evaluations.  Many 
engineering  and  scientific  applications  require  the 
evaluation  of  very  complex  fitness  functions.  These 
function  evaluations  drive  the  time  needed  for  the  GA 
to converge. Therefore, our criterion for how well a
GA run performs is the number of calls to
the cost function. A function evaluation is necessary for
each  new  offspring  (mutated  or  not)  plus  each  mutated 
member  of  the  old  population.  If  a  new  offspring  is 
selected  to  be  mutated  3  times,  then  only  one  function 
evaluation  is  done.  Otherwise,  a  high  mutation  rate 
would  force  a  large  number  of  unnecessary  function 
evaluations. 
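
The counting rule can be written down directly. In this sketch each chromosome of a generation is described by a hypothetical pair (is it a new offspring, how many times it was mutated); the representation and the name `count_evaluations` are illustrative only.

```python
def count_evaluations(population):
    """population: list of (is_new_offspring, n_mutations) pairs for one generation."""
    # One evaluation per chromosome whose cost is unknown, however often it was mutated.
    return sum(1 for is_new, n_mut in population if is_new or n_mut > 0)


generation = [
    (False, 0),  # surviving parent, untouched: cost already known
    (False, 2),  # surviving parent mutated twice: one re-evaluation
    (True, 0),   # new offspring, not mutated: one evaluation
    (True, 3),   # new offspring mutated three times: still one evaluation
]
print(count_evaluations(generation))
```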

One problem with a GA is determining when the
"correct" answer is found. We addressed this issue in
two ways for the function in (1). The first method used
-18.5 as the stopping criterion. Figure 3 shows the number
of function calls vs. the number of GA runs averaged
for a stopping point of -18.5. Oscillations occur until
about 150 runs have been averaged, so for this criterion
we would not consider the average to be stable until
then.

The second method of defining the "correct" solution
was less rigorous but probably more realistic. The
second lowest minimum of (1) is -16.9847 and occurs at
(x,y) = (7.4697,8.6681). Thus, if we obtain a value less
than this local minimum, we are assured that we have
found the valley of the global minimum. From there,
we could use the solution as a first guess for a local
optimizer that would quickly converge on the actual
minimum point. Since this two-step process is often
applied in practice, we stop the GA when the cost
is less than -17. Figure 4 shows the number of function
calls vs. the number of GA runs averaged for a stopping
point of -17. These results indicate that averaging as
few as 40 or 50 runs would give a reasonably consistent
average. Note that using -17 as the stopping point
required only about 1/4 as many runs for averaging
as using -18.5 as the stopping point.
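
The idea of watching the average settle as more runs are included can be sketched as follows. The stability test (the running mean stays within a relative tolerance of its final value) and the sample data are assumptions made for illustration; the paper judges stability visually from Figures 3 and 4.

```python
def running_mean(samples):
    """Mean of the first k samples, for k = 1..len(samples)."""
    total, out = 0.0, []
    for k, s in enumerate(samples, start=1):
        total += s
        out.append(total / k)
    return out


def runs_until_stable(samples, rel_tol=0.05):
    """Smallest k such that every running mean from k onward is within rel_tol of the final mean."""
    means = running_mean(samples)
    final = means[-1]
    k = len(means)
    while k > 1 and abs(means[k - 2] - final) <= rel_tol * abs(final):
        k -= 1
    return k


# Made-up function-call counts from six GA runs.
calls = [100, 300, 200, 200, 200, 200]
print(running_mean(calls), runs_until_stable(calls))
```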


Figure  3.  These  plots  show  both  the  average  and  the 
standard  deviation  of  the  number  of  function 
evaluations  when  the  GA  was  stopped  for  a  fitness 
that  was  less  than  -18.5. 


Figure  4.  These  plots  show  both  the  average  and  the 
standard  deviation  of  the  number  of  function 
evaluations  when  the  GA  was  stopped  for  a  fitness 
that  was  less  than  -17. 


Now that we have determined the number of runs
needed to average the GA to find the optimum
parameter set, the GA with a stopping criterion of -17 is
averaged over 40 runs with mutation rates and
population sizes of:

mutation  rate:  .01  to  .49  in  increments  of  .02 

population  size:  4,  12,  20,  28,  36,  44,  52,  60 

We  analyze  the  number  of  cost  function  evaluations 
required  to  converge  for  three  different  cost  functions. 
Figure  5  shows  the  number  of  function  calls  required  to 
find  a  point  lower  than  -17.  Very  low  mutation  rates 
result  in  a  huge  number  of  function  calls.  Small 
population  sizes  seem  to  generally  require  fewer 
function  calls  than  larger  ones.  The  results  indicate  that 
a  GA  with  a  small  population  size  (<16)  and  a  mutation 
rate  between  .15  and  .5  works  best. 
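
The size of this parameter sweep follows directly from the two lists given above. The sketch only enumerates the grid; the variable names are hypothetical and no GA is run.

```python
# Mutation rates .01, .03, ..., .49 and population sizes 4, 12, ..., 60, as listed above.
mutation_rates = [round(0.01 + 0.02 * i, 2) for i in range(25)]
population_sizes = list(range(4, 61, 8))

# Every (population size, mutation rate) pair is one point of the surface in Figure 5.
grid = [(p, m) for p in population_sizes for m in mutation_rates]

runs_per_point = 40
total_ga_runs = len(grid) * runs_per_point
print(len(mutation_rates), len(population_sizes), len(grid), total_ga_runs)
```

Given a dictionary of mean function calls keyed by these pairs, the best setting would simply be the key with the smallest value.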


Figure 5. The mean number of function calls is
plotted vs. the mutation rate and population size
when the GA is averaged over 40 runs.

III.  Optimizing  Side  Lobe  Tapers  - 
Example  1 

It  is  well  known  that  a  low  sidelobe  taper  can  be 
analytically  found  using  a  variety  of  methods  including 
the  Dolph-Chebychev  and  Binomial  distributions.  The 
point  here  is  to  just  use  a  test  case  for  the  GA  where  we 
know  the  best  solution  -  a  binomial  array.  In  fact,  local 
optimizers  provide  excellent  solutions  for  this  problem 
as  well.  The  authors  are  not  advocating  that  an  antenna 
designer  should  use  a  GA  to  find  an  amplitude  taper  for 
an  array.  There  are  many  other  much  better  techniques. 

Problem  Formulation 

The goal of the optimization is to find the weighting for
a linear array that produces the minimum maximum
sidelobe level. The objective function is given by
where

N = number of array elements
a = vector of amplitude weights
p = vector of phase weights
k = 2π/wavelength
d = element spacing
u = angle variable

In  the  cases  presented,  only  a  or  p  are  optimized  but  not 
both  in  the  same  GA  run.  Thus,  the  number  of 
parameters  to  be  optimized  is  the  length  of  the  a  or  p 
vector.  The  array  factor  is  calculated  from  broadside  to 
endfire,  and  a  search  is  performed  to  find  all  the  peaks. 
The  highest  peak  (outside  the  main  beam)  is  returned  as 
the  cost  of  the  function  call.  Most  of  the 
electromagnetics  community  use  elitism  and  off-line 
performance  for  the  various  applications  reported. 
These  assumptions  are  used  but  not  tested  here. 
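
A cost function of this kind might be sketched as below. Several details are assumptions, since the objective function itself is not reproduced in the text: the array factor is taken as the usual sum of a_n exp(j(p_n + n k d u)) over the elements, u runs from 0 (broadside) to 1 (endfire), the main beam is assumed to sit at broadside, and everything past the first null is treated as sidelobe region instead of searching for individual peaks. All names are hypothetical.

```python
import cmath
import math


def array_factor_db(amplitudes, phases, d_over_lambda=0.5, n_points=2001):
    """Normalized |AF(u)| in dB, sampled on u in [0, 1] (broadside to endfire)."""
    kd = 2 * math.pi * d_over_lambda
    mags = []
    for i in range(n_points):
        u = i / (n_points - 1)
        af = sum(a * cmath.exp(1j * (p + n * kd * u))
                 for n, (a, p) in enumerate(zip(amplitudes, phases)))
        mags.append(abs(af))
    peak = max(mags)
    # Clip very small values so that exact nulls do not produce log10(0).
    return [20 * math.log10(max(m / peak, 1e-12)) for m in mags]


def max_sidelobe_db(amplitudes, phases, **kwargs):
    """Highest level outside the main beam, taken as everything past the first null."""
    af_db = array_factor_db(amplitudes, phases, **kwargs)
    i = 0
    while i + 1 < len(af_db) and af_db[i + 1] < af_db[i]:  # walk down the main beam
        i += 1
    return max(af_db[i:])


# Uniform 18-element array with half-wavelength spacing as a sanity check.
sll = max_sidelobe_db([1.0] * 18, [0.0] * 18)
print(round(sll, 2))
```

For a uniformly weighted array the result should be close to the familiar first-sidelobe level of about -13 dB, which gives a quick check before the function is handed to an optimizer.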

The  GA  was  run  500  times  to  find  the  minimum 
maximum  sidelobe  level  for  a  29  element  array.  Figure 
6  shows  three  independent  plots  of  the  average  number 
of function calls to reduce the sidelobe level below -25
dB  vs.  the  number  of  GA  runs  included  in  the  average. 
The  lines  become  very  close  when  the  number  of  runs 
exceeds  250.  That’s  a  lot  of  averaging.  Figure  7  shows 
the  previous  plot  enlarged  in  the  region  of  1  to  25 
averages.  This  region  clearly  shows  that  averaging  the 
runs  is  critical  to  making  valid  interpretations  of  the 
data.  When  averaging  is  used,  the  number  of  function 
calls  varies  within  a  range  of  500  for  the  three  trials.  At 
ten  runs  in  the  average,  the  number  of  function  calls 
varies  by  90  and  at  20  the  variation  is  down  to  76. 
Averaging  more  than  100  runs  adds  a  high  level  of 
confidence  in  any  conclusions  made  concerning  the 
optimum  population  size  and  mutation  rate. 

Results 

The GA is first used to find the optimum
amplitude taper for an 18 element uniformly spaced
array (d = 0.5 wavelengths). The taper is symmetric
about the center of the array and the two center
elements have an amplitude of one. The algorithm stops
whenever the maximum sidelobe level falls more than 25 dB
below the peak of the main beam or the number of
function calls exceeds 50,000. The
GA was run 20 independent times and the results were
averaged for the following population sizes and
mutation rates:


Mutation rate = .01, .02, .03, ..., .4
Population size = 4, 8, 12, ..., 64


Figure 6. Plot of the average number of function
calls used by a GA to find the minimum maximum
sidelobe amplitude taper of an 18 element linear
array. The GA was run for up to 500 averages on
three independent occasions.


Figure 7. This plot magnifies the left region of the
graph in Figure 6.

Figure 8 displays a plot of the average number of
function calls vs. population size and mutation rate
when the results were averaged over 20 independent
runs. This graph is very low when the mutation rate is
less than 20%, except for a subregion where the
population size and mutation rate are small. Figure 9
shows another result where 20 independent runs were
averaged and the population size varied from 4 to 128
and the mutation rate was between 1 and 19%. This plot
shows the minimum number of function calls gradually
increases as population size increases. GAs take a long
time to converge when the population size is small and
the mutation rate is small because population diversity
comes at a slower rate.


Figure 8. The GA performed best (used the lowest
number of function calls) when the population size
was small and the mutation rate around 10%.


[Figure 9 plot: number of function calls vs. population size, averaged over 20 runs]

Figure 9. The lower mutation rates were run again and the range of population sizes was increased.

A strong region of performance in Figure 8 occurs for population sizes of 4 to 16 and mutation rates of 0.1 to 0.2. Figure 10 shows this region when the GA is averaged over 50 runs. The plot shows that a population size of 8 or less and a mutation rate of 13% or less produce excellent results. Still wary of abandoning conventional wisdom, we examined the region with population sizes of 4 to 128 and mutation rates of 0.0 to 0.05, averaging the GA over 50 runs. Results, shown in Figure 11, are best for the smallest population sizes and a mutation rate of 5%. Again, the region of low population size and low mutation rate yields slow convergence. Outside that range, the average number of function calls clearly increases as the population size increases. Mutation rate does not seem to play much of a role above a population size of 30.

In order to become more confident in the results presented in the previous figures, the GA was averaged over 500 runs for several different mutation rates and population sizes, as shown in Table 1. The table lists results (in number of function calls) from running a GA 200 times to find the optimum amplitude taper for an 18 element array that minimizes the maximum sidelobe level. A single GA run stopped when the sidelobe level went below -25 dB or the number of function calls exceeded 50,000. The minimum and maximum number of function calls over the 200 runs, as well as the mean and standard deviation of the number of function calls, are shown. A population size of 4 with a mutation rate of 15% produced the best average results. The next best mean number of calls was for a population size of 8 with a mutation rate of 15%, followed by a mutation rate of 20%. These results are consistent with those in Figure 8. The poor performance of the large population sizes and of a population size of 4 with a mutation rate of 20% was predicted in Figure 10.


Figure 10. This graph (averaged over 50 runs) shows that a small population size and a mutation rate of 0.1 cause a GA to find an answer in the fewest function evaluations.

Figure 11. Small population sizes and low mutation rates cause the GA to perform poorly (averaged over 50 runs). Note that, aside from very small population sizes, the mean number of function calls increases with population size independent of mutation rate.
Table 1. Results (in number of function calls) from running a GA 200 times to find the optimum amplitude taper for an 18 element array that minimizes the maximum sidelobe level. A single GA run stopped when the sidelobe level went below -25 dB or the number of function calls exceeded 50,000. The minimum and maximum number of function calls over the 200 runs, as well as the mean and standard deviation of the number of function calls for the 200 runs, are shown here.

Run   Mutation rate   Population size   Minimum   Maximum    Mean   Standard deviation
1     0.15                  4               26      3114      398       455
2     0.20                  4              110     50002     7479     12798
3     0.15                  8               60      2457      461       332
4     0.20                  8               49      2624      654       466
5     0.01                 64              300     50031     1158      3498
6     0.02                 64              277     11818     1028       911
7     0.01                128              393      2535     1410       365
8     0.02                128             1215     50071    10208     16077
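The experiment behind Table 1 can be sketched as follows. This is a minimal illustration, not the authors' code: the array is assumed symmetric with half-wavelength spacing, the sidelobe region is assumed to start at the first local minimum of the pattern, and the selection, single-point crossover and per-gene mutation schemes (as well as the names `max_sidelobe_db` and `run_ga`) are assumptions chosen for brevity.

```python
import math
import random


def max_sidelobe_db(half_taper, n_grid=400):
    """Maximum sidelobe level (dB relative to the main beam) of a symmetric,
    half-wavelength-spaced linear array with 2 * len(half_taper) elements."""
    pattern = []
    for i in range(n_grid + 1):
        psi = math.pi * i / n_grid  # psi = k*d*sin(theta), broadside to endfire
        af = sum(a * math.cos((m + 0.5) * psi) for m, a in enumerate(half_taper))
        pattern.append(abs(af))
    i = 0
    while i < n_grid and pattern[i + 1] < pattern[i]:
        i += 1  # walk down the main beam to its first local minimum
    peak = max(pattern[0], 1e-12)
    sidelobe = max(max(pattern[i:]), 1e-12)
    return 20.0 * math.log10(sidelobe / peak)


def run_ga(pop_size=8, mutation_rate=0.15, n_genes=9,
           target_db=-25.0, max_calls=50_000, seed=None):
    """Return (best sidelobe level in dB, number of cost-function calls)."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(n_genes)] for _ in range(pop_size)]
    cost = [max_sidelobe_db(c) for c in pop]
    calls = pop_size
    keep = pop_size // 2  # the better half survives (assumed scheme)
    while min(cost) > target_db and calls < max_calls:
        ranked = sorted(range(pop_size), key=lambda k: cost[k])
        pop = [pop[k] for k in ranked]
        cost = [cost[k] for k in ranked]
        for j in range(keep, pop_size):  # offspring replace the worse half
            mother, father = rng.sample(pop[:keep], 2)
            cut = rng.randrange(1, n_genes)
            pop[j] = mother[:cut] + father[cut:]
        for j in range(1, pop_size):  # chromosome 0 is kept as the elite
            dirty = j >= keep
            for g in range(n_genes):
                if rng.random() < mutation_rate:
                    pop[j][g] = rng.random()
                    dirty = True
            if dirty:  # only re-evaluate chromosomes that changed
                cost[j] = max_sidelobe_db(pop[j])
                calls += 1
    return min(cost), calls
```

Calling `run_ga(pop_size=4, mutation_rate=0.15, seed=0)` performs one run with the stopping rule described in the table caption. Because a generation is completed before the limit is checked, the call count can slightly exceed `max_calls`, which is the same effect seen in the maximum entries just above 50,000 in Table 1. A full 50,000-call run in pure Python may take on the order of a minute.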

IV. Optimizing Side Lobe Tapers - Example 2

The next example finds a low sidelobe phase taper for a linear array. Table 2 shows the results (in number of function calls) from running a GA 100 times to find the optimum phase taper that minimizes the maximum sidelobe level of a 40 element array. A single GA run stopped when the sidelobe level went below -14 dB or the number of function calls exceeded 50,000. The minimum and maximum number of function calls over the 100 runs, as well as the mean and standard deviation of the number of function calls for the 100 runs, are shown. Once again the number of function calls is smallest for the smaller population sizes coupled with relatively large mutation rates.

It should be noted that even for the best parameters used in these tables, not all runs converged, as evidenced by the maximum entries greater than 50,000. This fact has two implications. The first is that the mean number of function calls in the table would actually be higher if a limit were not in place. The second implication is that one should always be prepared to do multiple runs when using a GA since convergence is not assured.
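The first implication can be made concrete with a small sketch that summarizes a set of runs and counts how many hit the limit. The run data below are invented for illustration, and the use of the sample standard deviation is an assumption, since the paper does not say which estimator was used.

```python
import statistics


def summarize_runs(calls_per_run, call_limit=50_000):
    """Summary statistics for a batch of GA runs that share a call limit."""
    return {
        "minimum": min(calls_per_run),
        "maximum": max(calls_per_run),
        "mean": statistics.mean(calls_per_run),
        "std": statistics.stdev(calls_per_run),  # sample standard deviation (assumed)
        # Runs that reached the limit did not converge; their true cost is unknown,
        # so the reported mean is only a lower bound on the uncapped mean.
        "unconverged": sum(1 for c in calls_per_run if c >= call_limit),
    }


# Hypothetical call counts for five runs, one of which hit the limit.
summary = summarize_runs([134, 980, 2210, 50002, 640])
print(summary)
```

Whenever `unconverged` is non-zero, the reported mean understates the cost of the corresponding parameter setting.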

V. Optimizing Side Lobe Tapers - Example 3

In this example, a GA is run for 100,000 function evaluations in order to find the optimum amplitude taper for a 20 element array that minimizes the maximum sidelobe level. Table 3 shows the results in dB. The minimum and maximum result, as well as the mean and standard deviation of the best sidelobe level for the 100 runs, are shown. The population sizes of 4 and 8 with a 15% mutation rate outperformed the GAs with population sizes of 64 and 128 with a mutation rate of 2%.


Table 2. Results (in number of function calls) from running a GA 100 times to find the optimum phase taper that minimizes the maximum sidelobe level of a 40 element array. A single GA run stopped when the sidelobe level went below -14 dB or the number of function calls exceeded 50,000. The minimum and maximum number of function calls over the 100 runs, as well as the mean and standard deviation of the number of function calls for the 100 runs, are shown here.

Run   Mutation rate   Population size   Minimum   Maximum    Mean   Standard deviation
1     0.15                  4              134     50002     2973      5856
2     0.20                  4              163     50000     5232      9744
3     0.15                  8              168      8223     1827      1510
4     0.20                  8              124     21307     3220      3604
5     0.01                 64              614     50024     7914     15040
6     0.02                 64              546     50036     6624     13130
7     0.01                128              955     50043     4791      9708
8     0.02                128              933     50033     3942      7636
Table 3. Results (in dB) from running a GA for 100,000 function evaluations in order to find the optimum amplitude taper for a 20 element array that minimizes the maximum sidelobe level. The minimum and maximum result, as well as the mean and standard deviation of the best sidelobe level for the 100 runs, are shown here.

Run   Mutation rate   Population size   Minimum   Maximum    Mean   Standard deviation
1     0.15                  4            -57.5     -28.4    -36.1       4.6
3     0.15                  8            -46.0     -29.5    -36.5       3.3
6     0.02                 64            -42.5     -27.1    -32.5       3.3
8     0.02                128            -41.2     -28.0    -32.5       2.5

VI.  Conclusions 

The  results  of  the  numerical  experiments  presented  in 
this  paper  suggest  that  the  best  mutation  rate  for  GAs 
used  on  these  problems  lies  between  5  and  20%  while 
the  population  size  should  be  less  than  16.  These  results 
disagree  with  some  of  the  previous  studies  cited  and 
common  usage.  The  primary  reasons  for  these  results 
are  that  off-line  performance  was  used  and  that  a 
broader  range  of  population  size  and  mutation  rate  was 
included. In addition, the criterion judged here is the number of function evaluations, which is a good indicator of the amount of computer time required to solve the problem.

A way to interpret these results is in the context of the trade-off between exploration and exploitation. Traditionally, large populations have been used to thoroughly explore complicated cost surfaces. Crossover is then the operator of choice to exploit promising regions of phase space by combining information from promising solutions. The role of mutation is somewhat nebulous. As stated by Bäck [8], mutation is typically considered a secondary operator of little importance. Like us, he found that larger values than typically used are best for the early stages of a GA run. In one sense, greater exploration is achieved if the mutation rate is great enough to take the gene into a different region of solution space. Yet a mutation in the less critical genes may result in further exploiting the current region. Perhaps the larger mutation rates combined with the lower population sizes act to cover both properties without the large number of function evaluations required for large population sizes. Iterative approaches where the mutation rate varies over the course of a run, such as those of Bäck [8, 9] and Davis [15], are likely optimal, but require a more complex approach and algorithm. Note that when real parameters, small population sizes, large mutation rates, and an adaptive mutation rate are used, the algorithm begins to lurk more in the realm of what has traditionally been referred to as evolutionary strategies. We feel that names are a moot point and choose to do what we find works best for a problem. In particular, we prefer the engineering approach of switching to a different optimization algorithm once the global well is found, since at that point the more traditional optimization algorithms become more efficient.

When the population sizes are as small as found here, tournament selection offers no advantage over roulette wheel selection, so an evaluation of the trade-off between these selection operators was not done. Selection within a small population takes very little computer time. When doing the calculations for Table 3, the GA runs with large population sizes took at least 10% longer than the GA runs with small population sizes for a fixed number of function calls. This difference can be attributed to the weighting and ranking in the selection operator.

These results do not stand entirely alone. They are confirmed by our own prior results in [11] as well as those of Bäck [7, 8, 9] and are predicted by the theory of Gao [10]. De Jong [3] also found that a small population size and high mutation rate worked best during the initial generations and for off-line performance. This is consistent with the results here since the algorithm is stopped when a prescribed minimum in the valley of the true minimum is found. If the GA were then used to pass results to a local optimizer, the GA need only work on the problem a short time.

Although  these  conclusions  strictly  apply  to  only  the 
problems  presented,  in  practice  we  have  found  many 
other  problems  where  similar  principles  applied.  No 
attempt  has  been  made  to  thoroughly  investigate  all 
possible  combinations  of  parameters.  We  chose  to 
concentrate  on  population  size  and  mutation  rate  after 
our  own  experience  with  optimizing  GA  performance. 
We  make  no  claims  that  this  is  a  definitive  analysis:  our 
purpose  is  merely  to  suggest  that  future  GA 
practitioners  consider  a  wider  range  of  possible 
combinations  of  population  size  and  mutation  rate. 
Abstract - Hybrid FEM/MoM methods combine the finite element method (FEM) and the method of moments (MoM) to model inhomogeneous unbounded problems. These two methods are coupled by enforcing field continuity on the boundary that separates the FEM and MoM regions. There are three ways of formulating hybrid FEM/MoM methods: outward-looking formulations, inward-looking formulations and combined formulations. In this paper, the three formulations are compared in terms of computer-resource requirements and stability for four sample problem geometries. A novel preconditioning technique is developed for the outward-looking formulation. This technique greatly improves the convergence rate of iterative solvers for the types of problems investigated in this study.

Index Terms: FEM, MoM, EMC, sparse matrix, permutation, preconditioning, iterative solvers.

I.  INTRODUCTION 

Hybrid  FEM/MoM  methods,  which  are  also  referred  to 
as  FE-BI,  FE-MM,  or  FEM/BEM  in  the  literature,  combine 
the  finite  element  method  (FEM)  and  the  method  of 
moments  (MoM)  to  model  inhomogeneous  unbounded 
problems.  FEM  is  used  to  analyze  the  details  of  the  structure 
and  MoM  is  employed  to  terminate  the  FEM  meshes  and  to 
provide  an  exact  radiation  boundary  condition  (RBC).  These 
two  methods  are  coupled  by  enforcing  tangential-field 
continuity  on  the  boundary  separating  the  FEM  and  MoM 
regions.  Hybrid  FEM/MoM  techniques  were  introduced  in 
the  early  seventies  by  Silvester  and  Hsieh  [1],  and 
McDonald  and  Wexler  [2]  as  attempts  to  apply  FEM  to 
model  unbounded  radiation  problems.  FEM/MoM  was  not 
widely  used  until  the  late  eighties  due  to  its  large 
computational  requirements.  Yuan  [3],  and  Jin  and  Volakis 
[4],  [5]  were  among  the  first  to  apply  FEM/MoM  to  3D 
electromagnetic  problems  using  vector  basis  functions. 
Angelini  et  al  [6],  and  Antilla  and  Alexopoulos  [7]  later 
applied  FEM/MoM  to  3D  scattering  in  anisotropic  media. 

FEM/MoM  has  been  used  to  analyze  electromagnetic 
compatibility  (EMC)  problems  since  the  mid-nineties. 
Ali  et  al  [8]  employed  FEM/MoM  to  analyze  scattering  and 
radiation  from  structures  with  attached  wires.  Shen  and  Kost 
[9]  used  FEM/MoM  to  analyze  EMC  problems  in  power 
cable  systems.  FEM/MoM  has  also  been  utilized  to  model 
thin  shielding  sheets  and  microstrip  lines  [10],  [11]. 
Electronic devices with printed circuit boards (PCBs) are usually composed of many detailed structures: dielectrics, traces, cables, holes and vias. MoM is not well suited to model this kind of complex geometry efficiently. With a
hybrid  FEM/MoM  technique,  the  details  of  a  printed  circuit 
board  can  be  modeled  using  FEM  and  an  exact  radiation 
boundary  can  be  provided  using  MoM  to  terminate  the  FEM 
meshes.  When  the  structure  has  long  cables,  a  FEM/MoM 
method  is  particularly  efficient  because  the  cables  can  be 
modeled  by  MoM  without  meshing  the  empty  space  around 
the  cable. 

There  are  three  formulations  for  hybrid  FEM/MoM 
methods  [12]-[14].  The  first  formulation  constructs  an  RBC 
using  MoM  and  incorporates  the  RBC  into  the  FEM 
equations.  The  second  formulation  derives  an  RBC  from 
FEM  and  incorporates  the  RBC  into  the  MoM  equations. 
The  third  formulation  combines  the  FEM  and  MoM  matrix 
equations  to  form  a  large  matrix  equation  and  solves  for  all 
unknowns simultaneously. The first and second formulations are referred to as outward-looking and inward-looking, respectively, in [13], [14]. The last formulation is referred to as the combined formulation in this paper.

This  paper  compares  the  three  formulations  for  hybrid 
FEM/MoM  methods  and  presents  a  novel  preconditioning 
technique  that  can  be  applied  to  outward-looking 
formulations.  Section  II  describes  the  matrix  equations 
generated  by  FEM/MoM.  Section  III  introduces  four  sample 
problems  used  to  compare  the  three  formulations.  In  Section 
IV,  preconditioning  and  permutation  techniques  are 
presented.  Section  V  presents  the  outward-looking 
formulation  and  the  new  preconditioning  technique.  The 
inward-looking  formulation  is  described  in  Section  VI. 
Section  VII  presents  the  combined  formulation.  Section  VIII 
compares  the  three  formulations  in  terms  of  computer 
resource  requirements.  Finally,  conclusions  are  drawn  in 
Section  IX. 

II.  MATRIX  EQUATIONS  GENERATED  BY  FEM/MoM 

Full-wave  hybrid  FEM/MoM  methods  are  well  suited 
for  solving  problems  that  combine  small  complex  structures 
and  large  radiating  conductors.  The  original  problem  must 
be  divided  into  an  exterior  equivalent  problem  and  an 
interior  equivalent  problem.  MoM  is  used  to  model  the 
exterior  equivalent  problem  and  FEM  is  employed  to 
analyze  the  interior  equivalent  problem.  The  two  equivalent 
problems  are  related  by  enforcing  the  continuity  of 
tangential  fields  on  the  boundary  separating  the  FEM  and 
MoM  regions  [14]-[16]. 

The  electric-field  integral-equation  (EFIE)  is  generally 
used  to  describe  the  exterior  equivalent  problem  [17], 
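$$
\mathbf{E}^{inc}(\mathbf{r}) = \frac{1}{2}\mathbf{E}(\mathbf{r}) + \fint_S \left\{ \mathbf{M}(\mathbf{r}') \times \nabla' G_0(\mathbf{r},\mathbf{r}') + j k_0 \eta_0 \mathbf{J}(\mathbf{r}') G_0(\mathbf{r},\mathbf{r}') - j \frac{\eta_0}{k_0} \nabla' \cdot \mathbf{J}(\mathbf{r}') \, \nabla' G_0(\mathbf{r},\mathbf{r}') \right\} dS' \qquad (1)
$$

where $k_0$ and $\eta_0$ are the wavenumber and the intrinsic wave impedance in free space, and S is the surface enclosing the exterior equivalent problem. The integral term with a bar in Equation (1) denotes a principal-value integral. The singularity at $\mathbf{r} = \mathbf{r}'$ is excluded. The three-dimensional homogeneous Green's function is given by,

$$
G_0(\mathbf{r},\mathbf{r}') = \frac{e^{-j k_0 |\mathbf{r}-\mathbf{r}'|}}{4\pi |\mathbf{r}-\mathbf{r}'|} \qquad (2)
$$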


If  S  is  a  closed  surface,  the  EFIE  is  not  immune  to  false 
interior  resonances  [15],  [17],  [18].  If  the  interior  resonances 
cause  serious  problems,  the  combined  field  formulation  may 
be  employed  [12],  [18]. 

Triangular  basis  functions  (RWG  functions)  [19]  may 
be  employed  to  approximate  surface  fields.  A  Galerkin 
procedure  can  be  used  to  test  Equation  (1).  The  resulting 
MoM  matrix  equation  follows  [8], 


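$$
\begin{bmatrix} C_{hh} & C_{hc} \\ C_{ch} & C_{cc} \end{bmatrix}
\begin{bmatrix} J_h \\ J_c \end{bmatrix}
=
\begin{bmatrix} D_{hd} & 0 \\ D_{cd} & 0 \end{bmatrix}
\begin{bmatrix} E_d \\ 0 \end{bmatrix}
-
\begin{bmatrix} F_h \\ F_c \end{bmatrix} \qquad (3)
$$

where $\{J_h\}$ and $\{J_c\}$ are sets of unknowns for the electric current densities on the dielectric surface and perfect-electric-conductor (PEC) surface, respectively; $\{E_d\}$ is a set of unknowns for the electric field on the dielectric surface; $C_{hh}$, $C_{hc}$, $C_{ch}$, $C_{cc}$, $D_{hd}$ and $D_{cd}$ are dense coefficient matrices; $F_h$ and $F_c$ are source terms. The matrix formed by $C_{hh}$, $C_{hc}$, $C_{ch}$ and $C_{cc}$ in Equation (3) is called the MoM matrix or matrix C in this paper.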

The  interior  equivalent  problem  is  modeled  using  FEM. 
The  goal  is  to  solve  the  weak  form  of  the  vector  wave 
equation  as  follows  [14],  [20].  (This  equation  can  also  be 
derived  using  a  variational  approach  [16],  [21].) 


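$$
\int_{V_i} \left[ \frac{1}{j\omega\mu_0\mu_r} \left( \nabla \times \mathbf{E}(\mathbf{r}) \right) \cdot \left( \nabla \times \mathbf{w}(\mathbf{r}) \right) + j\omega\epsilon_0\epsilon_r \, \mathbf{E}(\mathbf{r}) \cdot \mathbf{w}(\mathbf{r}) \right] dV
= \int_{S_i} \left( \hat{n} \times \mathbf{H}(\mathbf{r}) \right) \cdot \mathbf{w}(\mathbf{r}) \, dS - \int_{V_i} \mathbf{J}^{int}(\mathbf{r}) \cdot \mathbf{w}(\mathbf{r}) \, dV \qquad (4)
$$

where $S_i$ is the surface enclosing the interior equivalent problem, $\mathbf{w}(\mathbf{r})$ is the weighting function, and $\mathbf{J}^{int}$ is an impressed source. Vector tetrahedral elements [22] can be used to approximate the E field. A Galerkin procedure can be used to test Equation (4). The resulting FEM matrix equation follows [8],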


$$
\begin{bmatrix} A_{ii} & A_{id} \\ A_{di} & A_{dd} \end{bmatrix}
\begin{bmatrix} E_i \\ E_d \end{bmatrix}
=
\begin{bmatrix} 0 & 0 \\ 0 & B_{dh} \end{bmatrix}
\begin{bmatrix} 0 \\ J_h \end{bmatrix}
+
\begin{bmatrix} g_i \\ g_d \end{bmatrix} \qquad (5)
$$

where $\{E_i\}$ and $\{E_d\}$ are sets of unknowns for the electric field within the FEM volume and on the dielectric surface, respectively; $\{J_h\}$ is a set of unknowns for the electric current density on the dielectric surface; $A_{ii}$, $A_{id}$, $A_{di}$, $A_{dd}$ and $B_{dh}$ are sparse coefficient matrices; $g_i$ and $g_d$ are source terms. The matrix formed by $A_{ii}$, $A_{id}$, $A_{di}$ and $A_{dd}$ in Equation (5) is called the FEM matrix or matrix A in this paper. Both the FEM and the MoM matrices are symmetric. Note that neither the FEM matrix equation nor the MoM matrix equation can be solved independently. They are coupled through the $J_h$ and $E_d$ terms.

One  objective  of  this  study  is  to  determine  which 
formulation works best for various problems. A coupling index, p, is defined in this paper as follows,

$$
p = \frac{\text{Number of FEM unknowns}}{\text{Number of MoM unknowns}} \qquad (6)
$$

The  value  of  p  is  determined  by  the  problem  geometry  and 
how  it  is  meshed.  As  shown  in  later  sections,  the  coupling 
index  p  can  be  used  as  a  rough  measure  to  determine  which 
formulation  is  preferred  for  a  given  problem. 
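As a quick check of Equation (6), the sketch below recomputes the coupling index from the unknown counts listed in Table 1 of this paper. The function name is illustrative. FEM unknowns are taken as the E_i and E_d counts and MoM unknowns as the J_h and J_c counts, an interpretation that reproduces the tabulated values; the J_h count for Problem 1 is inferred from the listed total of 1,137 unknowns.

```python
def coupling_index(n_ei, n_ed, n_jh, n_jc):
    """Ratio of FEM unknowns (E_i + E_d) to MoM unknowns (J_h + J_c)."""
    return (n_ei + n_ed) / (n_jh + n_jc)


# (E_i, E_d, J_h, J_c) for each sample problem, from Table 1.
problems = {
    "Problem 1": (402, 80, 80, 575),   # J_h = 80 inferred from the total of 1,137
    "Problem 2": (699, 612, 612, 0),
    "Problem 3": (4521, 1223, 1223, 454),
    "Problem 4": (2277, 360, 360, 136),
}

for name, counts in problems.items():
    print(f"{name}: p = {coupling_index(*counts):.2f}")
```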

III. SAMPLE PROBLEMS

Four  sample  problems  are  presented  to  compare  the 
outward-looking,  inward-looking  and  combined 
formulations  and  to  validate  the  preconditioning  techniques 
discussed  in  later  sections.  Three  of  the  problems  include 
PCB  structures,  which  are  key  elements  of  devices  that  are 
frequently  modeled  by  EMC  and  signal  integrity  (SI) 
engineers.  Each  of  these  three  problems  has  a  thin 
rectangular  shape  and  presents  a  unique  challenge.  The 
remaining  problem  has  a  spherical  shape  and  provides  a 
contrast  to  the  PCB-like  structures. 

A.  Problem  1:  A  PCB  Power  Bus  Structure 

The  first  problem  is  to  model  the  input  impedance  of  a 
PCB  power  bus  structure.  As  shown  in  Figure  1,  the  board 
dimensions  are  5  cm  x  5  cm  x  1.1  mm.  The  top  and  bottom 
planes  are  PECs.  The  dielectric  between  the  PEC  layers  has 
a  relative  dielectric  constant  of  4.5.  A  source  is  placed  in  the 
middle  of  the  board  between  the  planes.  The  MoM 
boundary  is  chosen  to  coincide  with  the  physical  boundary 
of  the  board.  The  E  fields  tangential  to  the  top  and  bottom 
planes  are  zero,  thus  no  E-field  unknowns  are  assigned  on 
the  two  planes  and  the  number  of  FEM  unknowns  is  small. 
Table  1  summarizes  the  discretization  of  this  problem  and 
the  other  problems  presented  in  this  section. 

B.  Problem  2:  Scattering  from  a  Dielectric  Sphere 

The second problem is to model the scattered fields from a dielectric sphere. As illustrated in Figure 2, the radius of the sphere is 0.15λ. The relative dielectric constant of the sphere material is 4.5. The incident wave travels along the z-axis. The polarization of the E field is along the x-axis. The goal is to model the far fields. The discretization of this problem is summarized in Table 1.
Figure 1. A PCB power bus structure.

Figure 2. Scattering from a dielectric sphere.

C. Problem 3: A Gapped Power Bus Structure

The third problem is to model a gapped power bus structure. As shown in Figure 3, the board dimensions are 152.4 mm x 101.6 mm x 2.39 mm. The board has a solid PEC plane on the bottom and a gapped PEC plane on the top. The dielectric between the top and bottom planes has a relative permittivity of 4.5. The gap is 5.1 mm wide and located in the center of the top plane. The discretization of this problem is summarized in Table 1. This board is much larger than the board in Problem 1. A fine mesh is used in the vicinity of the gap. To reduce the number of MoM elements, the MoM boundary is placed 9.56 mm above the gap, resulting in a large number of FEM unknowns.

Figure 3. Configuration of a gapped power bus structure.

D. Problem 4: A Microstrip Line

The fourth problem is to model the behavior of a microstrip line. The board dimensions are 5 cm x 5 cm x 1.1 mm as shown in Figure 4. The bottom is a solid PEC plane. The trace placed on the top plane is 3 cm long and 0.5 mm wide. The dielectric has a relative permittivity of 4.5. The goal of this problem is to determine the input impedance of the microstrip line at one end when the other end is terminated by a resistor. The discretization of this problem is summarized in Table 1. To reduce the number of boundary elements, the MoM boundary is placed 3.3 mm above the microstrip line. A fine FEM mesh is required in the vicinity of the microstrip line as shown in Figure 5. As a result, this problem has a large coupling index.

Figure 4. A microstrip line configuration.

Figure 5. The FEM mesh in the plane of the trace.


Table 1. Summary of the discretization of the four sample problems

             # of FEM unknowns     # of MoM unknowns     Total # of    Coupling
             E_i       E_d         J_h       J_c         unknowns      index p
Problem 1      402        80          80       575         1,137        0.74
Problem 2      699       612         612         0         1,923        2.14
Problem 3    4,521     1,223       1,223       454         7,421        3.43
Problem 4    2,277       360         360       136         3,133        5.32



ACES  JOURNAL,  VOL.  15,  NO.  2,  JULY  2000 


IV.  TECHNIQUES  FOR  SOLVING  SPARSE  MATRIX 
EQUATIONS 

A.  Preconditioning 

Iterative  solvers  are  widely  used  to  solve  large  sparse 
matrix  equations  of  the  form, 

Mx  =  b  (7) 

where M is a square matrix and b and x are column vectors. b is the source vector and x is the unknown vector. Equation (7) is also called a linear system.

To have a non-trivial solution, the matrix M must be non-singular ($\det(M) \neq 0$). The convergence rate of iterative solvers depends mainly on the condition number of the matrix M, which is defined as [14],

$$
\kappa(M) = \sqrt{\frac{\lambda_{max}}{\lambda_{min}}} \qquad (8)
$$

where $\lambda_{min}$ and $\lambda_{max}$ are the smallest and largest eigenvalues of the matrix $M^H M$, and $M^H$ is the conjugate transpose of M. The condition number provides a
measure  of  the  spectral  properties  of  a  matrix.  The  identity 
matrix  has  a  condition  number  of  1.0.  A  singular  matrix  has 
a  condition  number  of  infinity.  A  matrix  with  a  large 
condition  number  is  nearly  singular,  and  is  called  ill- 
conditioned.  An  ill-conditioned  linear  system  is  very 
sensitive  to  small  changes  in  the  matrix.  Iterative  solvers 
may  not  converge  smoothly,  or  may  even  diverge  when 
applied  to  ill-conditioned  systems. 

The  coefficient  matrices  generated  by  FEM  and  MoM 
usually  have  very  large  condition  numbers.  It  may  be 
difficult  to  apply  iterative  solvers  to  the  original  FEM  and 
MoM  matrix  equations.  However,  a  linear  system  can  be 
transformed  into  another  linear  system  so  that  the  new 
system  has  the  same  solution  as  the  original  one,  but  has 
better spectral properties. For instance, both sides of Equation (7) can be multiplied by the inverse of a square matrix P,

$$
P^{-1} M x = P^{-1} b \qquad (9)
$$

where P has the following properties,

(A) $\kappa(P^{-1}M) \ll \kappa(M)$

(B) $\det(P^{-1}M) \neq 0$

(C) It is inexpensive to solve Px = b.

Such  a  matrix  P  is  called  a  preconditioner.  This  technique  is 
called  preconditioning.  Condition  (A)  assures  favorable 
spectral  properties  for  the  new  linear  system.  Condition  (B) 
guarantees  that  the  new  system,  Equation  (9),  has  the  same 
non-trivial  solution  as  Equation  (7).  Condition  (C)  is 
essential  to  ensure  the  efficiency  of  preconditioned  iterative 
solvers. In preconditioned iterative algorithms, it is not necessary to compute $P^{-1}$ explicitly. Instead, a linear system of the form Px = b is solved at each step.

If the preconditioner P is chosen to be M, $P^{-1}M$ becomes an identity matrix. However, finding $M^{-1}$ is generally more difficult than solving Equation (7). It is more
practical  to  find  a  preconditioner  P  that  is  an  approximation 
of  M,  and  satisfies  all  three  conditions.  There  is  a  trade-off 
between  the  cost  of  constructing  and  applying  the 
preconditioner,  and  the  gain  in  the  convergence  rate  [23]. 
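The effect of condition (A) can be illustrated on a toy 2x2 real system. The sketch below assumes the usual 2-norm condition number, i.e. the square root of the ratio of the largest to the smallest eigenvalue of M^T M, and uses a simple diagonal (Jacobi) preconditioner P = diag(M). Both the matrix and the choice of preconditioner are illustrative assumptions and are not taken from this paper.

```python
import math


def condition_number_2x2(m):
    """sqrt(lambda_max / lambda_min) of M^T M for a real 2x2 matrix M."""
    (a, b), (c, d) = m
    # Entries of the symmetric matrix M^T M = [[p, q], [q, r]].
    p, q, r = a * a + c * c, a * b + c * d, b * b + d * d
    trace, det = p + r, p * r - q * q
    root = math.sqrt(max(trace * trace - 4.0 * det, 0.0))
    lam_max, lam_min = (trace + root) / 2.0, (trace - root) / 2.0
    return math.inf if lam_min <= 0.0 else math.sqrt(lam_max / lam_min)


def jacobi_precondition(m):
    """Return P^{-1} M for P = diag(M): each row is divided by its diagonal entry."""
    return [[entry / row[i] for entry in row] for i, row in enumerate(m)]


M = [[100.0, 1.0], [1.0, 1.0]]  # badly scaled toy matrix
kappa_before = condition_number_2x2(M)
kappa_after = condition_number_2x2(jacobi_precondition(M))
print(f"kappa(M) = {kappa_before:.2f}, kappa(P^-1 M) = {kappa_after:.2f}")
```

Here applying P costs only one division per row, so condition (C) is trivially met. Preconditioners for real FEM/MoM systems are more elaborate, but the trade-off between construction cost and improved conditioning is the same.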

LU  factorization  and  incomplete  LU  (ILU)  factorization 
are  commonly  used  to  construct  preconditioners.  LU 
factorization decomposes a matrix M into a lower triangular matrix L and an upper triangular matrix U, which satisfy,

$$
M = LU. \qquad (10)
$$

ILU  factorization  ([23],  [24]),  decomposes  matrix  M  into  a 
lower  triangular  matrix  L  and  an  upper  triangular  matrix  U 
so  that  the  residue  matrix  R  =  M-LU  is  subject  to  certain 
constraints,  such  as  levels  of  fill-in,  or  drop  tolerance. 
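One common fill-in constraint is to allow no fill at all, usually called ILU(0): L and U keep exactly the sparsity pattern of M. The sketch below is a dense-storage illustration of that idea on a small assumed matrix, not the implementation used in this study. It also forms the residue R = M - LU, which vanishes on the sparsity pattern of M and is non-zero only where fill-in was dropped.

```python
def ilu0(a):
    """Incomplete LU with zero fill-in; returns (L, U) with unit-diagonal L."""
    n = len(a)
    keep = [[a[i][j] != 0 for j in range(n)] for i in range(n)]  # pattern of M
    w = [[float(v) for v in row] for row in a]
    for i in range(1, n):
        for k in range(i):
            if not keep[i][k]:
                continue
            w[i][k] /= w[k][k]  # multiplier, stored in the strictly lower part
            for j in range(k + 1, n):
                if keep[i][j]:  # update only entries inside the pattern
                    w[i][j] -= w[i][k] * w[k][j]
    lower = [[1.0 if i == j else (w[i][j] if j < i else 0.0) for j in range(n)]
             for i in range(n)]
    upper = [[w[i][j] if j >= i else 0.0 for j in range(n)] for i in range(n)]
    return lower, upper


def residue(a, lower, upper):
    """R = M - L U, computed with dense loops for clarity."""
    n = len(a)
    return [[a[i][j] - sum(lower[i][k] * upper[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]


M = [[4.0, 1.0, 1.0, 1.0],
     [1.0, 4.0, 0.0, 0.0],
     [1.0, 0.0, 4.0, 0.0],
     [1.0, 0.0, 0.0, 4.0]]
L, U = ilu0(M)
R = residue(M, L, U)
```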

B.  Permutation 

Because the FEM matrix, A, is sparse, LU factorization may generate a lot of fill-in elements, which are matrix entries that are zero in the matrix A but are non-zero in the L and U matrices [24]. Permutation is a technique that can be used to reduce the number of fill-ins in LU factorization by reordering the matrix. Generally, a symmetric permutation on matrix M is defined as follows [24],

$$
M_P = P M P^T \qquad (11)
$$

where $M_P$ is the new matrix after permutation and P is the permutation matrix. P is a unitary matrix [24], which satisfies,

$$
P^{-1} = P^T. \qquad (12)
$$
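A classic way to see why ordering matters is an "arrow" matrix whose first row and column are dense. The sketch below is an assumed toy example, not one of the sample problems. It counts fill-ins for LU factorization without pivoting, before and after a symmetric permutation that moves the dense row and column to the end. The permutation is passed as an index list `perm`, so that the entry (i, j) of the permuted matrix is M[perm[i]][perm[j]], which corresponds to Equation (11).

```python
def lu_no_pivot(a):
    """Dense LU factorization without pivoting; returns (L, U), L unit lower."""
    n = len(a)
    u = [[float(v) for v in row] for row in a]
    l = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(n - 1):
        for i in range(k + 1, n):
            if u[i][k] == 0.0:
                continue  # nothing to eliminate, so no fill is created
            factor = u[i][k] / u[k][k]
            l[i][k] = factor
            u[i][k] = 0.0
            for j in range(k + 1, n):
                u[i][j] -= factor * u[k][j]
    return l, u


def count_fill_in(a):
    """Number of positions that are zero in A but non-zero in L or U."""
    l, u = lu_no_pivot(a)
    n = len(a)
    return sum(1 for i in range(n) for j in range(n)
               if a[i][j] == 0 and (l[i][j] != 0.0 or u[i][j] != 0.0))


def symmetric_permute(a, perm):
    """Apply the symmetric permutation of Equation (11) given as an index list."""
    return [[a[pi][pj] for pj in perm] for pi in perm]


n = 6
arrow = [[4.0 if i == j else (1.0 if 0 in (i, j) else 0.0) for j in range(n)]
         for i in range(n)]
reordered = symmetric_permute(arrow, list(range(n - 1, -1, -1)))
print(count_fill_in(arrow), count_fill_in(reordered))
```

With the dense row and column first, every zero entry fills in. With them last, the factors keep the sparsity of the original matrix.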

Figure 6 illustrates the sparsity pattern of the original FEM matrix in Problem 1. The number of unknowns in the FEM matrix is 482. A fully populated matrix has 482 x 482 = 232,324 entries. Figure 6 shows only 3,772 non-zero entries. The percentage of non-zero elements is 1.6%, indicating that the FEM matrix is highly sparse. Figure 7 illustrates the sparsity patterns of the L and U matrices after applying LU factorization to the FEM matrix in Problem 1. The data in Figure 7 was generated using MATLAB® [25]. The L matrix obtained by MATLAB is a "psychologically lower triangular matrix" (i.e. a product of lower triangular and permutation matrices) [26]. This explains why the L matrix is not a strictly lower triangular matrix. The total number of non-zero entries in L and U is 34,640 + 35,379 = 70,019. The total number of fill-ins is 70,019 - 3,772 = 66,247.

JI, WANG, HUBING: A NOVEL PRECONDITIONING TECHNIQUE & COMPARISON OF THREE FORMULATIONS

The reverse Cuthill-McKee algorithm can be used to reduce the bandwidth of a matrix [16], [27]. Bandwidth reduction techniques are useful because they save both storage and operation counts in LU factorization. Figure 8 shows the sparsity pattern of the FEM matrix in Figure 6 after performing a symmetric reverse Cuthill-McKee permutation. Figure 9 illustrates the sparsity patterns of the L and U matrices after the permutation. The number of fill-ins is 10,457 + 12,457 - 3,772 = 19,142. Compared with Figure 7, the number of fill-ins has been reduced by 71%.
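The idea behind the reverse Cuthill-McKee ordering can be sketched in a few lines: a breadth-first traversal of the matrix graph that visits low-degree neighbours first, followed by a reversal of the visiting order. The version below is a simplified illustration. It starts from a minimum-degree node rather than searching for a pseudo-peripheral node as production implementations typically do, and the small path-graph example is an assumption chosen so that the result is easy to verify by hand.

```python
from collections import deque


def reverse_cuthill_mckee(adj):
    """adj maps each node to the set of its neighbours (symmetric pattern).
    Returns the nodes in reverse Cuthill-McKee order."""
    visited, order = set(), []
    for start in sorted(adj, key=lambda v: (len(adj[v]), v)):
        if start in visited:
            continue  # already reached from an earlier component
        visited.add(start)
        queue = deque([start])
        while queue:
            v = queue.popleft()
            order.append(v)
            # Enqueue unvisited neighbours in order of increasing degree.
            for w in sorted(adj[v] - visited, key=lambda u: (len(adj[u]), u)):
                visited.add(w)
                queue.append(w)
    return order[::-1]


def bandwidth(adj, order):
    """Largest distance from the diagonal of any non-zero under this ordering."""
    pos = {v: i for i, v in enumerate(order)}
    return max((abs(pos[v] - pos[w]) for v in adj for w in adj[v]), default=0)


# A five-node chain 0-3-1-4-2 whose natural numbering scatters the non-zeros.
adj = {0: {3}, 1: {3, 4}, 2: {4}, 3: {0, 1}, 4: {1, 2}}
rcm_order = reverse_cuthill_mckee(adj)
print(bandwidth(adj, [0, 1, 2, 3, 4]), bandwidth(adj, rcm_order))
```

For this chain the reordering brings every non-zero next to the diagonal. On general meshes the algorithm is a heuristic that reduces the bandwidth without guaranteeing the minimum.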

The  minimum  degree  permutation  is  a  complicated  and 
powerful  technique  that  has  many  advantages  over  other 
permutation  techniques  [16],  [26].  One  widely  used 
implementation  was  proposed  by  George  and  Liu  [28].  This 
technique  reduces  fill-ins  during  Gaussian  elimination  based 
on graph theory [16], [29]. In this study, the authors used the symmetric minimum degree permutation provided by MATLAB®. Figure 10 shows the sparsity pattern of the FEM matrix in Figure 6 after performing the symmetric minimum degree permutation. Figure 11 illustrates the sparsity patterns of the L and U matrices after performing the symmetric minimum degree permutation. The number of fill-ins is 7,901 + 9,628 - 3,772 = 13,757. Compared with Figure 7, the number of fill-ins has been reduced by 79%.

Figure 6. Sparsity pattern of Problem 1 FEM matrix ("nz" is the number of non-zero entries).

Figure 7. Sparsity pattern of the Problem 1 L and U matrices after LU factorization.

Figure 8. Sparsity pattern of the Problem 1 FEM matrix after symmetric reverse Cuthill-McKee permutation.

[Figure panel title: The original FEM matrix]
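To make the link between elimination order and fill-in concrete, the following sketch performs symbolic elimination on a symmetric sparsity pattern and compares the natural order with a greedy minimum degree order. It is a textbook-style illustration under simplifying assumptions (no supernodes, no approximate degrees, ties broken by node label) and is not the George and Liu implementation or MATLAB's routine. The "arrow" pattern used here is a standard worst case for the natural order.

```python
def count_fill_in(adj, order):
    """Eliminate nodes in the given order; count new symmetric pairs created."""
    graph = {n: set(nbs) for n, nbs in adj.items()}
    fill = 0
    for node in order:
        nbs = graph.pop(node)
        for nb in nbs:
            graph[nb].discard(node)
        nbs = sorted(nbs)
        for i, u in enumerate(nbs):          # remaining neighbours become a clique
            for v in nbs[i + 1:]:
                if v not in graph[u]:
                    graph[u].add(v)
                    graph[v].add(u)
                    fill += 1
    return fill

def minimum_degree_order(adj):
    """Greedy ordering: always eliminate a node of smallest current degree."""
    graph = {n: set(nbs) for n, nbs in adj.items()}
    order = []
    while graph:
        node = min(graph, key=lambda n: (len(graph[n]), n))
        nbs = graph.pop(node)
        for nb in nbs:
            graph[nb].discard(node)
        for u in nbs:
            graph[u] |= nbs - {u}
        order.append(node)
    return order

# Arrow pattern: unknown 0 couples to every other unknown.
adj = {0: {1, 2, 3, 4, 5}, 1: {0}, 2: {0}, 3: {0}, 4: {0}, 5: {0}}
md_order = minimum_degree_order(adj)

print(count_fill_in(adj, range(6)), count_fill_in(adj, md_order), md_order)
```

Each counted pair corresponds to one extra entry in L and one in U. Eliminating the hub first fills the whole remaining block, while deferring it produces no fill at all.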


[Figure panel titles: the lower triangular matrix; the upper triangular matrix]

Figure 9. Sparsity pattern of the Problem 1 L and U matrices after symmetric reverse Cuthill-McKee permutation.

[Figure panel title: After symmetric minimum degree permutation]

Figure 10. Sparsity pattern of the Problem 1 FEM matrix after symmetric minimum degree permutation.

Figure 11. Sparsity pattern of the Problem 1 L and U matrices after symmetric minimum degree permutation.

V. THE OUTWARD-LOOKING FORMULATION AND A NOVEL PRECONDITIONING TECHNIQUE


The outward-looking formulation uses the coefficients of the electric field expansion in the interior equivalent problem, $E_i$ and $E_d$ in Equation (5), as the primary unknowns in the final matrix equation. This formulation has been employed by Paulsen et al. [31], Jin and Volakis [32], and Ramahi and Mittra [33].

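From Equation (3), the following equations can be derived,

$C_{ch} J_h + C_{cc} J_c = D_{cd} E_d - F_c \;\Rightarrow\; J_c = C_{cc}^{-1} (D_{cd} E_d - C_{ch} J_h - F_c)$  (13)

$C_{hh} J_h + C_{hc} J_c = D_{hd} E_d - F_h$.  (14)

Substituting Equation (13) into Equation (14) gives,

$(C_{hh} - C_{hc} C_{cc}^{-1} C_{ch}) J_h = (D_{hd} - C_{hc} C_{cc}^{-1} D_{cd}) E_d + C_{hc} C_{cc}^{-1} F_c - F_h$.  (15)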


To save computation time and memory, the following intermediate terms are introduced,

$N_{hc} = C_{hc} C_{cc}^{-1}$  (16)

$C'_{hh} = (C_{hh} - N_{hc} C_{ch})^{-1}$  (17)

$D'_{hd} = D_{hd} - N_{hc} D_{cd}$  (18)

$K_h = N_{hc} F_c - F_h$.  (19)

Equation (15) can now be written as,

$J_h = C'_{hh} (D'_{hd} E_d + K_h)$.  (20)
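The elimination in Equations (13) to (20) can be checked numerically. The sketch below treats every block as a 1 x 1 scalar with made-up values, which hides the non-commutativity of real matrix blocks but keeps the arithmetic exact. It assumes that $C'_{hh}$ denotes the inverse of $C_{hh} - N_{hc} C_{ch}$, as implied by comparing Equations (15) and (20).

```python
from fractions import Fraction as F

# Hypothetical scalar stand-ins for the MoM blocks and vectors.
C_hh, C_hc, C_ch, C_cc = F(4), F(1), F(2), F(3)
D_hd, D_cd = F(5), F(7)
F_h, F_c = F(1), F(2)
E_d = F(3)

# Intermediate terms, Equations (16) to (19).
N_hc = C_hc / C_cc
C_hh_prime = 1 / (C_hh - N_hc * C_ch)    # assumed form of Equation (17)
D_hd_prime = D_hd - N_hc * D_cd
K_h = N_hc * F_c - F_h

# Equation (20), then back-substitution through Equation (13).
J_h = C_hh_prime * (D_hd_prime * E_d + K_h)
J_c = (D_cd * E_d - C_ch * J_h - F_c) / C_cc

# Both original MoM rows should be satisfied exactly.
row_h = C_hh * J_h + C_hc * J_c == D_hd * E_d - F_h
row_c = C_ch * J_h + C_cc * J_c == D_cd * E_d - F_c
print(J_h, J_c, row_h, row_c)
```

With matrix blocks the same steps apply, except that each division becomes a linear solve and the order of the factors must be kept as written in the equations.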


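Substituting Equation (20) into Equation (5) gives,

$\left( A + \begin{bmatrix} 0 & 0 \\ 0 & -B_{dh} C'_{hh} D'_{hd} \end{bmatrix} \right) \begin{bmatrix} E_i \\ E_d \end{bmatrix} = \begin{bmatrix} g_i \\ g_d \end{bmatrix} + \begin{bmatrix} 0 \\ B_{dh} C'_{hh} K_h \end{bmatrix}$  (21)

where the matrix $A$ is the FEM matrix. Matrix $A_c$, $A'$ and vector $b$ are defined as follows,

$A_c = \begin{bmatrix} 0 & 0 \\ 0 & -B_{dh} C'_{hh} D'_{hd} \end{bmatrix}$  (22)

$A' = A + A_c$  (23)

$b = \begin{bmatrix} g_i \\ g_d \end{bmatrix} + \begin{bmatrix} 0 \\ B_{dh} C'_{hh} K_h \end{bmatrix}$.  (24)

With the unknown vector

$x = \begin{bmatrix} E_i \\ E_d \end{bmatrix}$  (25)

Equation (21) now becomes,

$A' x = b$.  (26)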


Equation  (26)  is  a  fully  determined  system  and  is  the 
final  matrix  equation  to  solve.  Note  that  the  order  of  this 
linear  system  is  the  same  as  the  order  of  the  original  FEM 
matrix.  The  Bi-Conjugate  Gradient  Stabilized  (BiCGSTAB) 
method  [23],  [24],  can  be  used  to  solve  Equation  (26). 
Although  BiCGSTAB  requires  less  memory  than  direct 
solvers  such  as  the  Gaussian  elimination  method,  it  may 
have  difficulty  converging,  or  may  even  diverge. 
Preconditioning  techniques  can  be  utilized  to  improve  the 
efficiency  and  accuracy  of  BiCGSTAB.  LU  factorization 
can  be  employed  to  construct  preconditioners. 

As shown in Figure 12, most of the non-zero elements are located in the bottom-right corner of matrix $A'$. Table 2 summarizes the number of non-zero entries in $A$, $A'$, and their LU factorizations. It is inefficient to perform LU factorization on $A'$ because the computer resources required for factorization may exceed those required for an iterative solution.

In  Equation  (23),  the  entries  in  the  matrix  Ac  have  much 
smaller  values  than  those  in  the  matrix  A  for  each  of  the 
sample  problems.  It  seems  reasonable  to  construct 
preconditioners  from  the  matrix  A  instead  of  the  matrix  A'. 
Furthermore,  the  matrix  A  is  sparse  and  symmetric,  so  the 
symmetric  minimum  degree  permutation  can  be  applied  to 
reduce  fill-ins  in  the  LU  factorization, 

$A_P = P A P^{T}$  (27)

where $P$ is the permutation matrix and $A_P$ is the new matrix after permutation. Next, an LU factorization can be applied to $A_P$ to obtain a lower triangular matrix $L$ and an upper triangular matrix $U$,

$A_P = L U$.  (28)

Figure 12. Sparsity pattern of Problem 1 $A'$ in Equation (26) (panel title: final FEM/MoM matrix).

Multiplying both sides of Equation (26) by $P$ and combining with Equation (12) gives,

$P A' P^{T} P x = P b$.  (29)

The following new terms are defined,

$A'' = P A' P^{T}$  (30)

$y = P x$ and $b'' = P b$.  (31)

Equation (29) becomes,

$A'' y = b''$.  (32)

Permutation does not change the condition number of a matrix,

$\kappa(A'') = \kappa(A')$.  (33)

Next, the preconditioners $L$ and $U$ can be applied to Equation (32),

$(LU)^{-1} A'' y = (LU)^{-1} b''$.  (34)

Iterative solvers can be used to solve Equation (34). Note that it is not necessary to explicitly compute $(LU)^{-1}$ when using iterative solvers [23], [24]. After $y$ is obtained, $x$ can be calculated from Equation (31),

$x = P^{-1} y = P^{T} y$.  (35)
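The change of variables in Equations (29) to (35) can be verified on a tiny system. The sketch below solves a hypothetical 3 x 3 system directly, then solves the symmetrically permuted system and maps the result back with $x = P^{T} y$. Exact rational arithmetic is used so the two answers can be compared for equality; the matrix, right-hand side, and ordering are illustrative only.

```python
from fractions import Fraction

def solve(a, b):
    """Gauss-Jordan elimination with exact rational arithmetic."""
    n = len(b)
    aug = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

A_prime = [[4, 1, 0],
           [1, 3, 1],
           [0, 1, 2]]
b = [1, 2, 3]
perm = [2, 0, 1]                 # row i of P selects entry perm[i]
n = len(b)

# Equations (30) and (31): A'' = P A' P^T and b'' = P b.
A_pp = [[A_prime[perm[i]][perm[j]] for j in range(n)] for i in range(n)]
b_pp = [b[perm[i]] for i in range(n)]

y = solve(A_pp, b_pp)            # Equation (32)
x = [None] * n
for i in range(n):               # Equation (35): x = P^T y
    x[perm[i]] = y[i]

x_direct = solve(A_prime, b)
print(x == x_direct, x)
```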

The technique discussed above was implemented using MATLAB®. Table 3 shows the condition number of $A'$ in Equation (26) and $(LU)^{-1} A''$ in Equation (34) for all four sample problems. This preconditioning technique greatly reduced the condition number of the matrix $A'$ and therefore improved the efficiency of the iterative solver.
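The idea of preconditioning $A' = A + A_c$ with a factorization of $A$ alone can be sketched in plain Python. Everything in this example is an assumption chosen for brevity: $A$ is a 20 x 20 tridiagonal matrix standing in for the sparse FEM matrix, $A_c$ is a small dense 3 x 3 block in the bottom-right corner, the "LU solve" is the Thomas algorithm, no permutation is applied, and the BiCGSTAB routine is a textbook version with a zero initial guess. It is not the authors' MATLAB implementation.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def bicgstab(apply_op, rhs, tol=1e-10, max_iter=500):
    """Plain BiCGSTAB with zero initial guess; returns (x, iterations)."""
    n = len(rhs)
    x, r, r_hat = [0.0] * n, list(rhs), list(rhs)
    rho = alpha = omega = 1.0
    v, p = [0.0] * n, [0.0] * n
    target = tol * norm(rhs)
    for k in range(1, max_iter + 1):
        rho_new = dot(r_hat, r)
        if rho_new == 0.0:
            return x, k                       # breakdown guard
        beta = (rho_new / rho) * (alpha / omega)
        p = [ri + beta * (pi - omega * vi) for ri, pi, vi in zip(r, p, v)]
        v = apply_op(p)
        denom = dot(r_hat, v)
        if denom == 0.0:
            return x, k                       # breakdown guard
        alpha = rho_new / denom
        s = [ri - alpha * vi for ri, vi in zip(r, v)]
        if norm(s) <= target:
            return [xi + alpha * pi for xi, pi in zip(x, p)], k
        t = apply_op(s)
        omega = dot(t, s) / dot(t, t)
        x = [xi + alpha * pi + omega * si for xi, pi, si in zip(x, p, s)]
        r = [si - omega * ti for si, ti in zip(s, t)]
        if norm(r) <= target or omega == 0.0:
            return x, k
        rho = rho_new
    return x, max_iter

N, M_BLOCK = 20, 3
AC_BLOCK = [[0.01] * M_BLOCK for _ in range(M_BLOCK)]   # dense corner of A_c

def apply_a(q):
    """Sparse part A = tridiag(-1, 2, -1)."""
    return [2.0 * q[i] - (q[i - 1] if i > 0 else 0.0)
            - (q[i + 1] if i < N - 1 else 0.0) for i in range(N)]

def apply_a_prime(q):
    """A'q = Aq + A_c q without ever forming A'."""
    out = apply_a(q)
    tail = q[N - M_BLOCK:]
    for i, row in enumerate(AC_BLOCK):
        out[N - M_BLOCK + i] += dot(row, tail)
    return out

def solve_a(rhs):
    """Solve A z = rhs via tridiagonal LU (Thomas algorithm)."""
    c, d = [0.0] * N, [0.0] * N
    c[0], d[0] = -0.5, rhs[0] / 2.0
    for i in range(1, N):
        pivot = 2.0 + c[i - 1]
        c[i] = -1.0 / pivot
        d[i] = (rhs[i] + d[i - 1]) / pivot
    z = [0.0] * N
    z[-1] = d[-1]
    for i in range(N - 2, -1, -1):
        z[i] = d[i] - c[i] * z[i + 1]
    return z

b = [1.0] * N
x_plain, it_plain = bicgstab(apply_a_prime, b)
# Left preconditioning: solve A^{-1} A' x = A^{-1} b.
x_pre, it_pre = bicgstab(lambda q: solve_a(apply_a_prime(q)), solve_a(b))
residual = norm([bi - yi for bi, yi in zip(b, apply_a_prime(x_pre))]) / norm(b)
print(it_plain, it_pre, residual)
```

Because the correction block here has low rank and small entries, the preconditioned operator is close to the identity and the iteration count drops sharply, which mirrors the trend reported in the paper without reproducing its numbers.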

Table 4 shows the solution times for each of the four problems using the un-preconditioned BiCGSTAB solver and the preconditioned BiCGSTAB solver. Only a small amount of time was spent constructing preconditioners. The preconditioning technique reduced the number of iterations by a factor ranging from 202 to 879, and achieved 15.9- to 149.6-fold improvements in the Equation (26) solution time. Table 5 examines the time spent on each step of the solution process for the four sample problems using the un-preconditioned solver and the preconditioned solver. For the first problem, there is not much difference between the un-preconditioned and the preconditioned solvers, because the time spent computing the matrix entries and on the coupling process is the dominant factor. For Problems 2, 3, and 4, the preconditioned solver yields 2.21-, 7.83- and 6.36-fold improvements, respectively. The bottom-right part of $A'$ is dense, as shown in Figure 12, and is scattered after $A'$ is permuted, as illustrated in Figure 13. This is not preferred because the locality of data in matrix $A_c$ is destroyed, which has a negative effect on the efficiency of the iterative solver. BiCGSTAB only needs to compute the product of the matrix $A'$ and the search vector $q$. Because

$A' q = A q + A_c q$,  (36)

it is not necessary to compute the matrix $A'$ explicitly. The FEM matrix can be stored using the ITPACK format [16], and the bottom-right part of $A_c$ can be stored in a two-dimensional array. Permutation is performed on the matrix $A$, vector $q$ and $A_c q$, but the matrix $A_c$ is not permuted. This storage scheme makes it unnecessary to keep track of the row and column information for every entry in $A_c$. Therefore, it uses much less computer memory than computing $A'$ explicitly and storing $A'$ as a sparse matrix.


Table 2. Non-zero elements in A, A', and their LU factorizations

             nz(A)*    nz(A')      nz(A)/nz(A')   nz(L)+nz(U)    nz(L')+nz(U')
                                   (%)            A = LU **      A' = L'U' ***
Problem 1     3,772       9,924    38%               17,175          33,488
Problem 2    17,745     389,229    4.6%             192,865       1,000,728
Problem 3    65,558   1,555,144    4.2%             983,322       2,962,187
Problem 4    36,135     163,829    22%              468,849         798,028

*   nz(A) refers to the number of non-zero elements in matrix A.
**  After symmetric minimum degree permutation.
*** After symmetric Cuthill-McKee permutation.


Table 3. Outward-looking formulation condition numbers before and after preconditioning

             κ(A')                          κ((LU)^-1 A'')
Problem 1    8.32 x 10^6                    1.07
Problem 2    4.27 x 10^[exponent illegible] 18.7
Problem 3    4.27 x 10^7                    N/A*
Problem 4    5.56 x 10^[exponent illegible] 813

* Data not available due to excessive memory requirement.

Table 4. Solution times for Equation (26) using the un-preconditioned and preconditioned BiCGSTAB solvers (The drop tolerance for the BiCGSTAB solver is 1.0 x 10^-3.)

                      LU Factorization   Iteration                            Total      Improvement
                      (sec)              Number   Converged    Time (sec)     (sec)      (fold)
                                                  (Yes/No)
Problem 1 (orig.*)    N/A                  202    Yes              2.03          2.03    15.9
Problem 1 (new**)     0.03                   1    Yes              0.09          0.12
Problem 2 (orig.*)    N/A                  715    Yes            206.10        206.10    58.1
Problem 2 (new**)     1.63                   2    Yes              1.92          3.55
Problem 3 (orig.*)    N/A                5,096    Yes          6,037.90       6,037.9    149.6
Problem 3 (new**)     12.06                  9    Yes             28.03         40.09
Problem 4 (orig.*)    N/A                2,637    No             386.77        386.77    46.7
Problem 4 (new**)     5.19                   3    Yes              2.91          8.10

*  "orig." refers to the un-preconditioned BiCGSTAB solver.
** "new" refers to the preconditioned BiCGSTAB solver.


Table 5. Time required to solve the four problems

             Compute          Coupling           Original                     Preconditioned               Improvement
             matrix entries   Equations (13) -   Solving Eq.    Total         Solving Eq.    Total         (%)
             (sec)            (21) (sec)         (17) (sec)     (sec)         (29) (sec)     (sec)
Problem 1     46.00            20.76                 2.03          68.79          0.12          66.88      3%
Problem 2     48.00            40.23               206.10         294.33          3.55          91.78      221%
Problem 3    287.20           438.60             6,037.90       6,763.7          40.10         765.90      783%
Problem 4     40.12            11.33               386.77         438.22          8.10          59.55      636%

Figure 13. Sparsity pattern of Problem 1 $A''$ in Equation (32) after minimum degree permutation (panel title: the final FEM/MoM matrix after permutation; nz = 9924).

VI.  THE  INWARD-LOOKING  FORMULATION 

The inward-looking formulation chooses the coefficients of the equivalent surface current expansion in the exterior equivalent problem ($J_h$ and $J_c$ in Equation (3)) as the primary unknowns in the final matrix equation. This formulation has been implemented by Jin and Liepa [34], Yuan et al. [35], and Cangellaris and Lee [36].

From Equation (5), the following derivation can be made,

$A_{ii} E_i + A_{id} E_d = g_i \;\Rightarrow\; E_i = A_{ii}^{-1} (g_i - A_{id} E_d)$  (37)

$A_{di} E_i + A_{dd} E_d = B_{dh} J_h + g_d \;\Rightarrow\; (A_{dd} - A_{di} A_{ii}^{-1} A_{id}) E_d = B_{dh} J_h + (g_d - A_{di} A_{ii}^{-1} g_i)$.  (38)

To save computation time and memory, the following intermediate terms are introduced,

$M_{dd} = (A_{dd} - A_{di} A_{ii}^{-1} A_{id})^{-1}$  (39)

$N_{dh} = M_{dd} B_{dh}$  (40)

$K_d = M_{dd} (g_d - A_{di} A_{ii}^{-1} g_i)$.  (41)

Equation (38) can be rewritten as,

$E_d = N_{dh} J_h + K_d$.  (42)

Substituting Equation (38) into Equation (3) gives the final matrix equation,

$\begin{bmatrix} C_{hh} - D_{hd} N_{dh} & C_{hc} \\ C_{ch} - D_{cd} N_{dh} & C_{cc} \end{bmatrix} \begin{bmatrix} J_h \\ J_c \end{bmatrix} = \begin{bmatrix} D_{hd} K_d - F_h \\ D_{cd} K_d - F_c \end{bmatrix}$.  (43)

Note that the order of this equation is the same as that of the MoM matrix. The inward-looking formulation inverts one sparse matrix $A_{ii}$ and one dense matrix $(A_{dd} - A_{di} A_{ii}^{-1} A_{id})$. Because the matrix in Equation (43) is dense, the Gaussian elimination method is used to solve the final matrix equation.

The outward-looking formulation is better suited to problems with a large number of FEM unknowns and fewer MoM unknowns. The inward-looking formulation is preferred for problems with a higher percentage of MoM unknowns. Of the four problems presented here, only Problem 1 has more MoM unknowns than FEM unknowns. As shown in Table 6, the inward-looking formulation is faster than the outward-looking formulation at solving Problem 1. However, the inward-looking formulation is not the best choice for the other three problems.


VII. THE COMBINED FORMULATION

The outward-looking and inward-looking formulations are computationally expensive because they invert two matrices. An alternative is to combine Equation (3) and Equation (5) to form the final matrix equation as follows,

$\begin{bmatrix} A_{ii} & A_{id} & 0 & 0 \\ A_{di} & A_{dd} & -B_{dh} & 0 \\ 0 & -D_{hd} & C_{hh} & C_{hc} \\ 0 & -D_{cd} & C_{ch} & C_{cc} \end{bmatrix} \begin{bmatrix} E_i \\ E_d \\ J_h \\ J_c \end{bmatrix} = \begin{bmatrix} g_i \\ g_d \\ -F_h \\ -F_c \end{bmatrix}$  (44)

and solve for all unknowns simultaneously [14]. This is referred to as the combined formulation in this paper. It has become more popular recently and has been employed by Sheng et al. [18], Jankovic et al. [37], and Shen et al. [38].

The combined formulation does not require any matrix inversions. However, it generates a larger matrix equation. The order of the final matrix is equal to the sum of the orders of the FEM and MoM matrices. As shown in Table 7, the matrix in the combined formulation has a much larger condition number than the final matrix in the outward-looking formulation. Due to these large condition numbers, it can be very difficult to generate preconditioners using LU factorization or other preconditioning techniques. Consequently, iterative methods may not work well, especially when the MoM part is large. Table 8 lists the normalized residue of the solutions to Equation (44) for the four sample problems using the Bi-Conjugate Gradient (BiCG), BiCGSTAB and Generalized Minimal Residual (GMRES) methods [23]. None of them reaches the designated drop tolerance of 1.0 x 10^-3. Problem 2, which has a different geometry (a sphere) from the other three PCB problems, has a much smaller condition number, and two of the iterative solvers converge to acceptable residues. This may explain why the authors in [18], [37] did not report convergence problems for the combined formulation. Shen et al. [38] showed that the ILU factorization with different fill-in levels worked very well for their applications. However, the problems presented in [38] have a large number of FEM unknowns (>16,000) and very few MoM unknowns (<200). The four sample problems presented in this paper have a higher percentage of MoM unknowns because the MoM boundary is applied closer to the object being modeled. For the four sample problems presented here, the ILU factorization technique fails to converge.

The Gaussian elimination method can also be used to solve Equation (44). However, a large number of fill-ins are generated during Gaussian elimination. To reduce the number of fill-ins, $\{E_i\}$ in Equation (44) can be permuted. However, permuting $\{E_d, J_h, J_c\}$ in Equation (44) destroys the data locality of the matrices $C$ and $D$ and therefore is not preferred.

VIII. COMPARING THE THREE FORMULATIONS

Table 9 lists the time required using each of the three formulations to solve the sample problems. The outward-looking formulation inverts two dense matrices and performs a lot of matrix multiplication. However, this formulation is excellent when the coupling index p is large, mainly because the preconditioning technique presented in Section V greatly reduces the time spent solving the final matrix equation. The inward-looking formulation excels when the coupling index p is small. It performs poorly when p is large because the inverse of the sparse FEM matrix is dense and the coupling process is time-consuming. The combined formulation was not optimum for any of the sample problems, although it worked reasonably well for solving Problem 1 and Problem 2.

Table 10 lists the computer memory requirements for each of the three formulations. The outward-looking formulation required the least amount of memory. One reason for this was that BiCGSTAB was used to solve the final equation and the FEM matrix was stored as a sparse matrix. Another reason was that the symmetric minimum degree permutation significantly reduced the number of fill-ins when constructing preconditioners. For the inward-looking formulation, the inverse of the FEM matrix and the matrix in Equation (43) were dense, so this formulation required more memory than the outward-looking formulation. The inward-looking formulation required less or more memory than the combined formulation, depending on the value of p. The combined formulation required much more memory than the outward-looking formulation because the Gaussian elimination method was used to solve the matrix equation. The exact amount of time and memory required to solve a problem depends on many factors, such as the mesh quality, the order of $\{E_i, E_d, J_h, J_c\}$, and the convergence rate of iterative solvers. The coupling index p can be used as a rough measure to determine which formulation is preferred. Based on the four sample problems and the authors' experience, the outward-looking formulation is preferred when p > 2.0; the inward-looking formulation is preferred when p < 1.5. The combined formulation is not preferred due to its large memory requirement (when using a Gaussian elimination solver) and its poor convergence rate (when using an iterative solver). The combined formulation is acceptable when the problem is not memory-constrained.
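The rule of thumb above can be written as a small helper and compared with the timings in Table 9. This is only a sketch: the function name and return strings are hypothetical, the text states no preference for 1.5 <= p <= 2.0 so that band is left open, and the Problem 3 entries for the inward-looking and combined formulations are treated as unavailable.

```python
def preferred_formulation(p):
    """Rule of thumb from the text; the band 1.5 <= p <= 2.0 is left open."""
    if p > 2.0:
        return "outward-looking"
    if p < 1.5:
        return "inward-looking"
    return "no stated preference"

# (coupling index p, seconds per formulation) transcribed from Table 9;
# None marks entries reported as not available.
TABLE_9 = {
    "Problem 1": (0.74, {"outward-looking": 66.88, "inward-looking": 58.82, "combined": 59.81}),
    "Problem 2": (2.12, {"outward-looking": 91.78, "inward-looking": 103.60, "combined": 92.66}),
    "Problem 3": (3.43, {"outward-looking": 765.90, "inward-looking": None, "combined": None}),
    "Problem 4": (5.32, {"outward-looking": 59.55, "inward-looking": 219.90, "combined": 76.59}),
}

def fastest(times):
    """Name of the formulation with the smallest available time."""
    available = {name: t for name, t in times.items() if t is not None}
    return min(available, key=available.get)

for problem, (p, times) in TABLE_9.items():
    print(problem, p, preferred_formulation(p), fastest(times))
```

For these four problems the rule picks the formulation with the shortest measured time, although the margin over the combined formulation is small for Problems 1 and 2.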

Depending on the type of problems being solved, the three formulations may exhibit instability problems. As pointed out by Pearson et al. [13] and Peterson et al. [14], the inward-looking formulation is susceptible to uniqueness difficulties. As shown in Equation (37), $A_{ii}^{-1}$ must be computed. $A_{ii}$ is essentially the FEM matrix for a closed cavity that might be resonant, which means $A_{ii}^{-1}$ does not exist or is nearly singular at resonant frequencies. However, typical EMC problems that model the high-frequency loss present in the problem geometries are not likely to exhibit this instability.

Table 6. Comparison between the outward-looking and inward-looking formulations

             Compute     Outward-looking (preconditioned BiCGSTAB)     Inward-looking (Gaussian elimination)
             matrix      Coupling           Solving         Total      Coupling           Solving         Total
             entries     Equations (13) -   Equation (32)   (sec)      Equations (37) -   Equation (43)   (sec)
             (sec)       (21) (sec)         (sec)                      (43) (sec)         (sec)
Problem 1    46.00       20.76              0.12            66.88        1.63             11.19            58.82
Problem 2    48.00       40.23              3.55            91.78       46.83              8.80           103.60
Problem 4    40.12       11.33              8.10            59.55      174.92              4.89           219.90

Problem 3 is not listed in this table because the inward-looking formulation requires excessive computer memory.


Table 7. The condition number for the outward-looking and combined formulations without preconditioning

             κ(LHS*) (Outward-looking)   κ(LHS**) (Combined)
Problem 1    8.32 x 10^6                 4.38 x 10^10
Problem 2    2.87 x 10^3                 2.71 x 10^7
Problem 3    4.27 x 10^7                 3.81 x 10^11
Problem 4    5.56 x 10^7                 1.78 x 10^11

*  LHS refers to the matrix on the left-hand side of Equation (26).
** LHS refers to the matrix on the left-hand side of Equation (44).


Table 8. Solutions to Equation (44) using iterative solvers without preconditioning
(The drop tolerance was 1.0 x 10^-3; the maximum iteration number was set to be the size of the matrix equation.)

Normalized least residue

              Problem 1   Problem 2   Problem 3   Problem 4
BiCG          0.66        0.89        0.60        0.50
BiCGSTAB      0.34        0.0058      0.19        0.41
GMRES(5)*     0.31        0.0049      0.19        0.39

* GMRES restarted after every five search-directions.


Table 9. Time required by the three formulations

             p      Outward-looking* (sec)   Inward-looking (sec)   Combined (sec)
Problem 1    0.74    66.88                    58.82                  59.81
Problem 2    2.12    91.78                   103.60                  92.66
Problem 3    3.43   765.90                   N/A**                   N/A**
Problem 4    5.32    59.55                   219.90                  76.59

*  The drop tolerance for the BiCGSTAB solver is 1.0 x 10^-3.
** The results are not available due to excessive memory requirements.


Table 10. Computer memory requirements of the three formulations

             p      Outward-looking (MBytes)   Inward-looking (MBytes)   Combined (MBytes)
Problem 1    0.74     7                         17                        34
Problem 2    2.12    23                         42                        70
Problem 3    3.43   107                        N/A*                      N/A*
Problem 4    5.32    11                        126                        36

* Data not available due to excessive memory requirements.

The outward-looking and the combined formulations do not have the uniqueness problem associated with $A_{ii}^{-1}$. However, all three formulations may have uniqueness difficulties at interior resonant frequencies caused by the EFIE [15], [17], [18]. The exterior equivalent problem can be constructed in a manner that avoids the problem of interior resonance (e.g., using a combined field formulation [6], [18]).

IX.  CONCLUSIONS 

This  paper  presents  three  formulations  for  the  hybrid 
FEM/MoM  method.  The  outward-looking  formulation 
constructs  an  RBC  using  MoM  and  then  substitutes  the 
RBC  into  the  FEM  equations.  Iterative  solvers  can  be 
used  to  solve  the  final  matrix  equation  efficiently.  The 
authors have found that it is much faster, and less memory
intensive, to construct preconditioners based on LU
factorization of the FEM matrix rather than the final
matrix. The symmetric minimum degree permutation can
reduce  the  number  of  fill-ins  resulting  in  further  memory 
reduction.  The  preconditioning  technique  presented 
greatly  reduced  the  number  of  iterations  required  by  the 
solver  for  the  sample  problems  presented  here.  The 
outward-looking  formulation  is  preferred  when  the 
coupling  index  p  is  larger  than  2.0. 

The  inward-looking  formulation  derives  an  RBC 
using  the  FEM,  then  substitutes  the  RBC  into  the  MoM 
equations. The Gaussian elimination method is generally used to solve the final matrix equation. The inward-looking formulation is preferred when the coupling index p is smaller than 1.5.

The  combined  formulation  generates  a  large  matrix 
equation  directly  without  inverting  any  matrices,  and 
solves  for  all  unknowns  simultaneously.  For  the  types  of 
problems  studied  here,  it  was  difficult  to  apply  iterative 
solvers  to  the  resulting  matrix  equations  due  to  their  large 
condition  numbers. 

The  choice  of  hybrid  FEM/MoM  formulation 
depends  on  the  problem  geometry  and  the  way  it  is 
meshed.  However,  for  the  printed  circuit  board 
geometries  investigated  in  this  paper,  the  outward-looking 
formulation  appears  to  be  the  most  effective  and  most 
efficient approach.

ABSTRACT - This paper presents a new excitation model for probe-fed printed antennas on both infinite and finite size ground planes. The model has been developed within the general frame of the mixed potential integral equation (MPIE) and the method of moments (MoM). The technique is based on a delta-gap voltage model, and a special procedure is implemented inside the integral equation to effectively impose a voltage reference plane into a floating metallic plate which is acting as a ground plane. The present technique allows the accurate calculation of the input impedance of printed antennas, and the effects of finite size ground planes can be easily accounted for in the calculations. In addition, an efficient technique is presented for the evaluation of the radiation patterns of printed antennas, also taking into account the presence of finite size ground planes. Comparisons with measured results show that the newly derived excitation method is indeed accurate, and can be used for the prediction of the backside radiation and side lobe levels of real-life finite ground plane printed antennas.

Keywords - Integral equation, excitation models, finite ground plane, backside radiation, printed antennas.

1  INTRODUCTION 

During the last decades, printed circuits and antennas have played an important role in many branches of electrical engineering, and the field of application is spreading to new technologies and to even higher frequencies. The need for miniaturisation is increasing in many applications, e.g., telecommunications and space missions. Obviously, models assuming infinite ground planes are not adequate for these compact geometries.

The need to account for finite ground plane dimensions in microstrip antenna modelling arises especially in applications where patches are used as free-standing structures and the front-to-back ratio must be maximized in order to avoid interference problems [Bokhari et al. 1992], or to locate a potential main beam deformation caused by diffraction from the ground plane edges. Moreover, the need to model the excitation on two floating metallic patches can become inevitable in applications like dual band stacked printed antennas, where a first patch acts as ground plane for a second radiating element [Zürcher et al. 1999].

To solve this problem, a new excitation model and de-embedding technique for the computation of the input impedance of probe-fed printed antennas on finite size ground planes using a Mixed Potential Integral Equation (MPIE) technique [Mosig and Gardiol 1988, Hall and Mosig 1996] has been developed. This approach accounts for the effect of the ground plane dimensions on the input impedance, the mutual coupling, and the radiation characteristics of a single antenna element or a finite array.

As a first step to attain this goal, a new attachment mode for probe-fed printed antennas on an infinite ground plane has been developed. The most widely used excitation model for probe-fed antennas is the impressed-current model [Pozar 1982, Hall and Mosig 1989]. This model assumes that a constant impressed current is exciting the antenna, and it uses the derived distribution of currents on metallic surfaces to compute the voltage at the probe location. This method may lead to accurate results but needs the computation of a surface integral over all the metallic surfaces present in the structure to obtain the input impedance. In contrast, the model presented here, as described in Sec. 3, uses a delta-gap voltage excitation model (to the authors' knowledge used until now only for microstrip line fed antennas [Davidovitz and Lo 1989, Harokopus and Katehi 1991, Eleftheriades and Mosig 1996]). This model assumes an impressed voltage between the antenna and the ground plane and, once the surface currents have been computed, only a normalisation by the excitation voltage is needed to obtain the input admittance. Another remarkable difference between the two models is the type of special basis functions used in the attachment mode. Considering the case of triangular meshing (the extension to rectangular cells is straightforward), in the impressed-current model one (or more) entire basis function with opposite sign of the current on its two halves is used to model the horizontal spreading of the vertical current coming from the coaxial probe. In the present model, one to three half basis functions are introduced for the attachment mode, depending on the location of the feed. This implies that the present excitation model can be used for any probe location inside the patch, including its edge, and also for microstrip line fed antennas [Tiezzi et al. 1999], without exception.

These  excitation  models  as  well  as  the  subsequent  tech¬ 
nique  for  computing  impedances  are  implicitly  based  on  the 
assumption  of  an  infinite  ground  plane,  which  according 
to  image  theory  automatically  produces  a  zero  voltage 
at  ground  plane  level.  In  Sec.  4  the  attachment  mode  is 
modified  in  order  to  take  into  account  the  finiteness  of  the 
ground  plane.  Here,  instead  of  using  Green’s  functions 
including  the  ground  plane  effect  through  image  theory,  a 
specific  numerical  treatment  is  applied  to  the  ground  plane. 
To  the  authors’  knowledge,  the  first  approach  using  an 
MPIE  formulation  for  the  study  of  finite  size  ground 
planes  can  be  found  in  [Bokhari  etal  1992].  This  work, 
however,  only  represents  an  approximation  of  the  real  finite 
structure,  since  the  currents  induced  on  the  antenna  are 
computed  using  an  infinite  ground  plane  model.  Once  the 
induced  currents  are  computed,  the  finite  size  nature  of 
the  ground  plane  is  taken  into  account,  at  a  later  stage, 
during  the  calculation  of  the  scattering  problem  asso¬ 
ciated  with  the  computed  currents.  Hence  the  results 
presented  in  [Bokhari  et  al  1992]  are  only  accurate,  if  the 
ground  plane  is  sufficiently  large:  it  would  therefore  be 
desirable  to  develop  a  rigorous  method,  which  remains 
valid  even  for  very  small  ground  planes.  The  method 
presented  in  this  paper  is  a  full  wave  method  based  on  the 
MPIE  technique,  and  the  only  approximation  introduced 


is that we use the multilayered media Green's functions
formulated in the traditional form of Sommerfeld integrals
[Mosig and Gardiol 1988, Mosig 1989]. Therefore the
currents induced in the structure are computed taking into
account from the outset the finite size of the ground
plane, but the second-order effect of dielectric truncation
is  neglected.  This  approximation  has  been  introduced 
to  maximize  the  numerical  efficiency  and  its  accuracy 
is  confirmed  by  our  results.  In  addition  to  being  more 
rigorous,  another  advantage  of  this  approach  is  that  the 
effects  on  the  input  impedance  of  the  finite  size  ground 
planes  can  accurately  be  evaluated  and  moreover  scattering 
from  ground  plane  edges  can  be  taken  into  account.  Thus 
full  range  (including  backside  scattering)  radiation  patterns 
can  also  be  predicted. 

2  BACKGROUND  AND  STATEMENT  OF  THE 
PROBLEM 

The  new  excitation  model  presented  in  this  paper  has  been 
developed  in  the  frame  of  the  analysis  of  multilayered 
printed  circuits  and  antennas  following  the  MPIE  formula¬ 
tion  [Mosig  and  Gardiol  1988].  The  generic  structure  under 
analysis is presented in Fig. 1. As shown, it is composed
of one or more conducting patches embedded in a stratified
medium. Either a perfect conductor ground plane or a
free space layer extending to z = -∞ can be placed at
the  bottom  of  the  structure.  Each  dielectric  layer,  which 
may  be  lossy,  is  assumed  to  be  homogeneous,  isotropic 
and  transversally  infinite.  The  conducting  patches  are  as¬ 
sumed  to  have  finite  transverse  size,  arbitrary  shape,  negli¬ 
gible  thickness  and  an  infinite  conductivity,  although  finite 
conductivity can easily be taken into account using Leontovich
boundary conditions [Mosig and Gardiol 1985].
Under  these  assumptions  the  boundary  condition  for  the 
electric  field  on  the  surface  of  the  conducting  strips  is  writ¬ 
ten  as 

$$\mathbf{e}_z \times (\mathbf{E}^e + \mathbf{E}^s) = 0 \qquad (1)$$

where $\mathbf{E}^e$ and $\mathbf{E}^s$ are respectively the excitation and the
scattered electric field.

The  scattered  field  is  expressed  in  terms  of  the  vector  and 
scalar  potential  A  and  V  as 

$$\mathbf{E}^s = -j\omega\mathbf{A} - \nabla V, \qquad \mathbf{H} = \frac{1}{\mu}\nabla \times \mathbf{A} \qquad (2)$$

Figure 1: Generic multilayered structure containing an arbitrary number of finite metallizations. (a) Multilayered medium. (b) Equivalent network.


with  the  potentials  related  by  the  Lorentz  gauge 
[Mosig  1989] 

$$j\omega\mu\varepsilon V + \nabla \cdot \mathbf{A} = 0 \qquad (3)$$

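The vector and scalar potentials $\mathbf{A}$, $V$ can in turn be
expressed in terms of superposition integrals of the corresponding
Green's functions $\bar{\mathbf{G}}_A$, $G_V$ weighted by the unknown
distributions of surface current and electric charge
$\mathbf{J}_s$, $\rho_s$:

$$\mathbf{A} = \int_S \bar{\mathbf{G}}_A(\mathbf{r}|\mathbf{r}') \cdot \mathbf{J}_s(\mathbf{r}')\, dS'$$

$$V = \int_S G_V(\mathbf{r}|\mathbf{r}')\, \rho_s(\mathbf{r}')\, dS' \qquad (4)$$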

and  finally,  using  the  continuity  equation  to  express  the 
electric  charge  in  terms  of  current,  the  boundary  condition 
in  equation  (1)  becomes 

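$$\mathbf{e}_z \times \mathbf{E}^e = \mathbf{e}_z \times \left( j\omega \int_S \bar{\mathbf{G}}_A(\mathbf{r}|\mathbf{r}') \cdot \mathbf{J}_s(\mathbf{r}')\, dS' - \frac{\nabla}{j\omega} \int_S G_V(\mathbf{r}|\mathbf{r}')\, \nabla' \cdot \mathbf{J}_s(\mathbf{r}')\, dS' \right) \qquad (5)$$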

which  is  the  basic  integral  equation  to  be  solved  to  find  the 
unknown  distribution  of  surface  currents. 
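
The intermediate step between equations (1)-(4) and the integral equation can be written out as follows. This is a sketch that assumes the $e^{j\omega t}$ time convention (consistent with the expression $\mathbf{E}^s = -j\omega\mathbf{A} - \nabla V$) and the standard surface continuity equation; the primed operator $\nabla'$ acting on source coordinates is notation introduced here for clarity.

```latex
\begin{align}
\nabla' \cdot \mathbf{J}_s(\mathbf{r}') + j\omega\,\rho_s(\mathbf{r}') = 0
  \quad &\Longrightarrow \quad
  \rho_s(\mathbf{r}') = -\frac{1}{j\omega}\,\nabla' \cdot \mathbf{J}_s(\mathbf{r}') \\
V(\mathbf{r}) &= -\frac{1}{j\omega} \int_S G_V(\mathbf{r}|\mathbf{r}')\,
  \nabla' \cdot \mathbf{J}_s(\mathbf{r}')\, dS' \\
\mathbf{e}_z \times \mathbf{E}^e = -\,\mathbf{e}_z \times \mathbf{E}^s
  &= \mathbf{e}_z \times \left( j\omega\,\mathbf{A} + \nabla V \right)
\end{align}
```

Inserting the superposition integrals for $\mathbf{A}$ and $V$ into the last line leaves the surface current $\mathbf{J}_s$ as the only unknown.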

The  multilayered  media  Green’s  functions  appearing  in 
equation  (5)  are  derived,  in  the  spectral  domain, 
from  the  equivalent  transmission  line  circuit  shown 
in  Fig.  1(b),  as  described  in  [Mosig  and  Gardiol  1988, 
Michalski  and  Mosig  1997].  Furthermore,  these  Green’s 


functions are calculated in the spatial domain using special
numerical methods for the evaluation of the Sommerfeld
integral, as extensively described in [Mosig 1989,
Alvarez-Melcon and Mosig 1996].

The  previous  integral  equation  (5)  is  solved  by  the  Method 
of  Moments.  The  conducting  patches  are  segmented  into 
triangular cells and triangular rooftops [Rao et al. 1982]
are  used  as  basis  and  test  functions,  applying  a  Galerkin 
method.  If  coaxial  excitation  is  used,  modified  basis 
functions  are  introduced  at  the  coaxial  pin  location  in  or¬ 
der  to  model  the  spread  on  the  patch  of  the  current  flowing 
on  the  vertical  pin. 

3  A  NEW  ATTACHMENT  MODE 

A  special  set  of  basis  functions,  called  the  attachment  mode, 
is  used  to  ensure  the  continuity  of  the  current  between 
the  coaxial  probe  and  the  antenna.  In  the  present  ap¬ 
proach  the  attachment  mode  is  derived  directly  from  the 
delta-gap  voltage  excitation  model  used  on  microstrip  lines 
[Eleftheriades and Mosig 1996]. As shown in Fig. 2, an efficient
excitation model is obtained for the microstrip case by
applying a voltage source of magnitude $V_m$ across an
infinitesimally small gap of length δ → 0 between the feeding
line and the ground plane. The flow of induced currents
through  the  edge  of  the  microstrip  line  is  modeled  intro¬ 
ducing  one  or  more  half  subsectional  basis  functions  (half 
rooftop  in  the  present  case)  as  shown  in  Figs.  2a,  2c,  and  2d 
[Eleftheriades and Mosig 1996].

Figure 2: A delta-gap voltage source exciting printed circuits. a) Colinear transition between a coaxial probe and a microstrip line. b) Perpendicular transition between a coaxial probe and a microstrip line. c) Delta-gap voltage model applied to a coaxial probe-fed microstrip line. d) Associated MoM description of the excitation model. e) Coaxial probe-fed patch antenna. f) Associated MoM description of the excitation of a probe-fed patch antenna.


It  is  well  known  that  at  least  for  electrically  thin  dielectrics, 
no  difference  in  the  measurement  can  be  noticed  when  the 
microstrip  line  is  fed  by  a  vertical  coaxial  probe  (Fig.  2b), 
so  it  can  be  affirmed  that  the  previous  delta-gap  excitation 
model  is  still  valid  in  this  case.  The  next  step  is  to  apply  the 
same  method  to  a  point  located  inside  the  patch  (see  Fig. 
2e), bearing in mind that current can spread in any direction.
This behaviour can be obtained by introducing three (or fewer if the
feed is close to the edge) new half rooftops, one for each
edge of the triangle containing the feeding point, which are
superimposed on the halves of the standard rooftops already
attached to the triangle (see Fig. 2f). It must be stressed that
at  this  point  six  half  rooftops  (one  couple  for  each  side)  are 
present  in  the  triangle,  but  only  three  of  them  are  involved  in 
the  attachment  mode  and  they  are  attached  to  three  virtual 
vertical  half  rooftops,  while  the  other  three  are  connected  to 
the  halves  located  in  the  adjacent  triangles  to  form  standard 
“planar” basis functions. It is also important to point out that
to  reach  a  good  model  of  the  physical  excitation,  the  area  of 
the  triangle  with  the  attachment  mode  must  be  reasonably 
small,  the  lower  limit  being  imposed  by  the  section  of  the 
internal  conductor  of  the  coaxial  cable. 
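
The role of the half rooftops on the feed triangle can be illustrated with a short Python sketch. It assumes the usual normalisation of [Rao et al. 1982] for a half rooftop, f(r) = l/(2A) (r - v), where l is the length of the associated edge, A the triangle area and v the vertex opposite that edge. The function names and the sample triangle are hypothetical and are not taken from the paper.

```python
import math

def half_rooftop(r, free_vertex, edge_length, area):
    """Half rooftop on one triangle: f(r) = l / (2A) * (r - free_vertex)."""
    scale = edge_length / (2.0 * area)
    return (scale * (r[0] - free_vertex[0]), scale * (r[1] - free_vertex[1]))

def outward_normal(a, b, opposite):
    """Unit normal of edge a-b in the triangle plane, pointing away from `opposite`."""
    tx, ty = b[0] - a[0], b[1] - a[1]
    norm = math.hypot(tx, ty)
    nx, ny = ty / norm, -tx / norm
    if nx * (a[0] - opposite[0]) + ny * (a[1] - opposite[1]) < 0:
        nx, ny = -nx, -ny
    return (nx, ny)

def normal_components(vertices, p):
    """Normal component of the half rooftop tied to edge p, sampled at each edge midpoint.

    Edge q is the edge opposite vertex q.
    """
    v = [tuple(map(float, x)) for x in vertices]
    d1 = (v[1][0] - v[0][0], v[1][1] - v[0][1])
    d2 = (v[2][0] - v[0][0], v[2][1] - v[0][1])
    area = 0.5 * abs(d1[0] * d2[1] - d1[1] * d2[0])
    a_p, b_p = v[(p + 1) % 3], v[(p + 2) % 3]
    length = math.hypot(b_p[0] - a_p[0], b_p[1] - a_p[1])
    values = []
    for q in range(3):
        a_q, b_q = v[(q + 1) % 3], v[(q + 2) % 3]
        midpoint = (0.5 * (a_q[0] + b_q[0]), 0.5 * (a_q[1] + b_q[1]))
        n_q = outward_normal(a_q, b_q, v[q])
        f = half_rooftop(midpoint, v[p], length, area)
        values.append(f[0] * n_q[0] + f[1] * n_q[1])
    return values

if __name__ == "__main__":
    # Hypothetical right triangle; edge 0 is the hypotenuse.
    print(normal_components([(0, 0), (4, 0), (0, 3)], 0))
```

Under this assumption the half rooftop carries a unit normal current density across its own edge and none across the other two edges, so a line integral of its normal component along the triangle perimeter reduces to the length of its own edge.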

The application of the Method of Moments (MoM) to solve
the integral equation (5) leads to a system of linear equations
that can be written compactly as


Figure  3:  Basic  geometry  of  a  probe-fed  printed  antenna  used  in 
the  formulation  of  the  excitation  model. 

$$e_i = \sum_{k=1}^{N_f} \alpha_k P_{i,k}, \qquad i = 1, 2, \ldots, N_f \qquad (6)$$

where $P_{i,k}$ is the (i,k)-th term of the moment matrix, $\alpha_k$ is
the k-th term of the unknown electric current density vector,
$N_f$ is the total number of basis functions and $e_i$ is the i-th
term of the excitation vector. The latter is defined as

$$e_i = \int_S \mathbf{E}^e \cdot \mathbf{f}_i(\mathbf{r})\, dS \qquad (7)$$

where $\mathbf{E}^e$ represents the impressed electric field, and $\mathbf{f}_i(\mathbf{r})$
are the subsectional testing functions of the MoM. The unknown
electric currents can now be expanded as
$$\mathbf{J}_s = \sum_{k=1}^{N_f} \alpha_k \mathbf{f}_k(\mathbf{r}') \qquad (8)$$

where $N_f$ is the total number of basis functions, and $\alpha_k$ are
the unknown coefficients in the expansion.

With  reference  to  the  port  geometry  shown  in  Fig.  3,  we 
apply the delta-gap model only to the three “half” basis
functions  of  the  attachment  mode,  which  allows  us  to  write 
the  excitation  field  created  by  the  voltage  source  as 

$$\mathbf{E}^e = V_m \sum_{p=1}^{3} \delta(\mathbf{r} - \mathbf{r}_p)\, \mathbf{n}_p \qquad (9)$$

where $\mathbf{r}_p$, p = 1, 2, 3, denotes the position vector of the
three edges associated with the port. Substituting equation (9)
in equation (7) we obtain


$$e_i = V_m \int_S \left[ \sum_{p=1}^{3} \delta(\mathbf{r} - \mathbf{r}_p)\, \mathbf{n}_p \cdot \mathbf{f}_i(\mathbf{r}) \right] dS \qquad (10)$$

Using the integration properties of the Dirac delta function
and defining $f_i^{n_p}(\mathbf{r}) = \mathbf{n}_p \cdot \mathbf{f}_i(\mathbf{r})$ as the component of the
basis function perpendicular to the p-th triangle edge,
equation (10) reduces to

$$e_i = V_m \int_C \left[ \sum_{p=1}^{3} f_i^{n_p}(\mathbf{r}_p) \right] dl \qquad (11)$$

where C is the perimeter of the triangle with the attachment
mode (see Fig. 3). Defining now

$$\gamma_i = \int_C \left[ \sum_{p=1}^{3} f_i^{n_p}(\mathbf{r}_p) \right] dl \qquad (12)$$

which is an integral with an easily obtained analytical solution,
we can introduce (11) in (6) and obtain the following
system of linear equations

$$V_m \gamma_i = \sum_{k=1}^{N_f} \alpha_k P_{i,k}, \qquad i = 1, 2, \ldots, N_f \qquad (13)$$

The solution of this system of linear equations gives the
values of the unknown coefficients $\alpha_k$. These can then be
used to compute the current $I_m$ flowing through the port as
follows

$$I_m = \int_C \left[ \sum_{p=1}^{3} \mathbf{J}_s(\mathbf{r}_p) \cdot \mathbf{n}_p \right] dl = \sum_{k=1}^{N_f} \alpha_k \int_C \left[ \sum_{p=1}^{3} f_k^{n_p}(\mathbf{r}_p) \right] dl = \sum_{k=1}^{N_f} \alpha_k \gamma_k \qquad (14)$$

From equation (14) the input impedance of the circuit is directly
obtained by dividing both terms of the equation
by the exciting voltage $V_m$, and then by inverting the resulting
input admittance, namely:

$$Z_{in} = \frac{1}{Y_{in}}, \qquad Y_{in} = \frac{I_m}{V_m} = \sum_{k=1}^{N_f} \frac{\alpha_k \gamma_k}{V_m} \qquad (15)$$

Figure 4: Probe-fed patch antenna on an infinite ground plane. Substrate: REXOLITE 2200, h = 1.59 mm, $\varepsilon_r$ = 2.62, tan δ ≈ 0.002.

Figure 5: Comparison of measured and computed results of the input impedance of the antenna in Fig. 4. □ measurement, + theory (increment 5 MHz clockwise, measurement reproduced from [James and Hall 1989]).

To verify the validity of the derived model we have analysed
the  basic  probe-fed  printed  patch  antenna  presented  in 
[James  and  Hall  1989].  For  simplicity  the  geometry  of  the 
antenna  is  reported  in  Fig.  4.  The  input  impedance  of  the 
antenna has been measured for the fundamental ($TM_{10}$)
mode and for three different placements of the feed (see
Fig. 4). The comparison between the measurement and the
computed results, presented in Fig. 5, shows the accuracy
achieved with the present model.
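
The chain from the linear system (13) to the port current (14) and the input impedance (15) can be sketched numerically as follows. This is only an illustration: the 2 × 2 moment matrix and the vector γ used in the example are made-up numbers, and `solve_linear_system` and `input_impedance` are hypothetical helpers, not the authors' code.

```python
def solve_linear_system(matrix, rhs):
    """Gaussian elimination with partial pivoting (adequate for small dense systems)."""
    n = len(rhs)
    a = [[complex(x) for x in row] + [complex(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) == 0.0:
            raise ValueError("singular moment matrix")
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for j in range(col, n + 1):
                a[row][j] -= factor * a[col][j]
    x = [0j] * n
    for row in reversed(range(n)):
        s = sum(a[row][j] * x[j] for j in range(row + 1, n))
        x[row] = (a[row][n] - s) / a[row][row]
    return x

def input_impedance(moment_matrix, gamma, v_m=1.0):
    """Solve V_m * gamma = P * alpha, then form I_m, Y_in and Z_in."""
    alpha = solve_linear_system(moment_matrix, [v_m * g for g in gamma])  # eq. (13)
    i_m = sum(a_k * g_k for a_k, g_k in zip(alpha, gamma))                # eq. (14)
    y_in = i_m / v_m
    return 1.0 / y_in                                                     # eq. (15)

if __name__ == "__main__":
    # Made-up data: only the first basis function belongs to the attachment mode.
    P = [[2 + 1j, 0.5], [0.5, 3 - 1j]]
    gamma = [1.0, 0.0]
    print(input_impedance(P, gamma))
```

Because the system is linear, the computed impedance does not depend on the value chosen for the source voltage; the entries of γ are zero for every basis function that does not belong to the attachment mode.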

4  ANALYSIS  OF  PROBE-FED  PATCH 
ANTENNAS  ON  FINITE  SIZE  GROUND 
PLANES 

In  this  section  we  describe  how  the  excitation  model  pre¬ 
sented  in  the  previous  section  must  be  modified  in  order 
to  take  into  account  the  finiteness  of  the  ground  plane. 
The study is presented for a simple printed patch antenna,
but the extension to more complicated structures is
straightforward. An important difference between the analysis
presented in this paper and traditional analyses
such as the one performed in the previous section (see also
[Bunger  and  Arndt  1997]),  is  that  in  the  present  case  the 
Green’s  functions  derived  do  not  take  into  account  infinite 
ground  planes,  and  therefore,  all  metallizations  are  consi¬ 
dered  to  be  finite.  The  main  difficulty  in  doing  this  is  that 
the  condition  of  null  potential  at  the  ground  plane  is  not 
automatically  imposed  by  the  Green’s  functions.  As  a  con¬ 
sequence,  now  the  finite  ground  plane  must  be  introduced 
inside  the  integral  equation  to  enforce  the  proper  boundary 
conditions  on  it,  and  the  currents  induced  on  this  reference 
ground  plane  must  also  be  computed.  Also,  a  new  excita¬ 
tion  model  and  de-embedding  technique  must  be  derived  to 
be  able  to  extract  the  actual  input  impedance  of  the  antenna 
when  such  floating  grounds  are  considered  as  references. 
This  is  mainly  due  to  the  fact  that  the  ground  plane  is  no 
longer  acting  as  an  automatic  reference  plane  for  the  ge¬ 
nerator,  so  that  the  reference  condition  of  the  finite  ground 
plane  must  be  introduced  explicitly  in  the  model. 

The  advantages  of  such  finite  ground  plane  models  are 
clear.  First,  the  effects  of  a  finite  size  ground  plane  on 
the  input  impedance  of  antennas  can  be  accurately  taken 
into  account.  Secondly,  the  diffraction  of  the  radiated 
field  on  the  edges  of  finite  size  ground  planes  can  also  be 
studied.  This  will  give  an  idea  of  the  back-radiation  of 


Figure 6: Probe-fed patch antenna on a finite size ground plane.


microstrip  antennas,  including  the  side-lobe  levels  which 
might  be  expected  in  their  radiation  patterns.  Both  ele¬ 
ments  are  of  key  importance  in  the  design  of  antennas,  and 
up  to  now  they  could  only  be  evaluated  through  measure¬ 
ments,  or  with  lengthy  numerical  calculations  using  tech¬ 
niques  such  as  the  finite  elements  or  the  finite  differences 
[Ciampolini  et  al  1996]. 

Let  us  now  consider  the  basic  microstrip  antenna  with  finite 
size ground plane represented in Fig. 6. In contrast to the case
of  an  infinite  ground  plane,  where  the  excitation  is  injected 
only  through  the  patch  while  the  ground  plane  is  included  in 
the  Green’s  functions,  the  model  must  be  modified  in  the  fi¬ 
nite  ground  case  so  that  the  finite  ground  plane  is  connected 
to  the  generator  and  surface  currents  must  be  free  to  flow 
through  this  connection.  This  is  obtained  by  using  a  “mir¬ 
ror”  attachment  model  in  the  ground  plane  with  the  sign 
of  the  currents  reversed.  Also,  the  potential  of  the  ground 
plane  is  set  to  zero  by  means  of  a  numerical  treatment  acting 
on  the  MPIE  formulation.  Fig.  7  presents  the  basic  idea  of 
the  extended  attachment  mode.  If  we  take  again  the  case 
of  a  transition  from  a  coaxial  cable  to  a  microstrip  line,  but 
where  the  size  of  the  microstrip’s  ground  plane  is  now  fi¬ 
nite  (Fig.  7a),  the  equivalent  excitation  model  can  be  repre¬ 
sented  with  a  voltage  generator  connected  to  the  microstrip 
line  as  in  the  previous  case,  but  with  the  grounded  termi¬ 
nal  now  connected  to  the  physical  ground  plane  (Fig.  7b). 
As  depicted  in  the  figure,  the  currents  flowing  through  the 
two  terminals  of  the  generator  must  be  the  same.  There¬ 
fore  the  same  “spreading”  behavior  of  the  current  must  be 
imposed  in  both  the  microstrip  patch  and  the  ground  plane. 
This behaviour can be obtained in the MoM implementation
by introducing one half basis function on the ground
plane for each one present on the microstrip and linking
the two halves together to form an entire basis function (see
Fig. 7c), i.e. only one unknown term for each couple is
present in the MoM matrix [Tiezzi et al. 1999]. This implies
that the free edges of the two half basis functions must
have the same length. Applying now the same scheme to
the probe-fed patch antenna represented in Fig. 7d, starting
from the attachment mode sketched in Fig. 2e, we obtain
the new attachment mode composed of three (or fewer) half
basis functions on the patch and the same number of half
basis functions with opposite sign on the ground plane.

Figure 7: Attachment mode for patch antennas on finite ground planes. a) Colinear transition between a coaxial probe and a microstrip line. b) Delta-gap voltage model applied to a coaxial probe-fed microstrip line. c) Associated MoM description of the excitation model. d) Coaxial probe-fed patch antenna. e) Associated MoM description of the excitation of a probe-fed patch antenna.
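
One way to picture the statement that each patch/ground couple contributes a single unknown is the following Python sketch. It assumes that the interactions between the individual half functions have already been computed and stored in a matrix ordered as [patch halves, ground halves], and that each linked basis function is the patch half minus the corresponding ground half. The function name and the numbers are hypothetical; the paper does not describe its implementation at this level.

```python
def pair_attachment_halves(p_half, n_pairs):
    """Collapse a 2n x 2n half-function interaction matrix into an n x n matrix.

    Rows/columns 0..n-1 are the patch halves, n..2n-1 the ground halves.
    With f_pair = f_patch - f_ground, a Galerkin scheme gives
    P_pair[i][j] = P_pp[i][j] - P_pg[i][j] - P_gp[i][j] + P_gg[i][j].
    """
    n = n_pairs
    return [
        [
            p_half[i][j] - p_half[i][n + j] - p_half[n + i][j] + p_half[n + i][n + j]
            for j in range(n)
        ]
        for i in range(n)
    ]

if __name__ == "__main__":
    # Made-up interactions for a single patch half and its mirrored ground half.
    print(pair_attachment_halves([[4.0, 1.0], [1.0, 3.0]], 1))
```

The reversed sign on the ground half is what makes the same current leave one generator terminal and enter the other, so no additional unknown is needed for the ground side of the port.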

To  demonstrate  the  effectiveness  of  the  derived  model,  the 
antenna  in  Fig.  4  has  been  simulated  with  a  ground  plane  of 
width Wg = 214 mm and length Lg = 214 mm, again for
three positions of the coaxial excitation. The agreement
between theory and measurement (Fig. 8) is rather good.
Indeed  our  model  can  work  for  any  size  of  ground  plane 
from  the  completely  unbalanced  antenna  (infinite  ground 
plane)  to  a  perfectly  balanced  antenna  (ground  plane  having 
the  patch’s  size).  The  latter  case  has  been  tested  for  an 
antenna  on  a  RT/DUROID  5870  substrate  with  thickness 
h  =  1.57  mm  and  relative  dielectric  constant  er  =  2.33. 
With  respect  to  Fig.  6  the  dimensions  of  the  antenna  are 
Wp  =  Wg  =  120.1  mm,  Lp  =  Lg  =  79.5  mm,  Xp  =  60 
mm,  Yp  =  29  mm.  The  results  are  presented  in  Fig.  9.  The 
agreement between measured and computed results is excellent.
For comparison, the result obtained using
the infinite ground plane model has also been included, and
it shows that in this extreme case the infinite ground plane
approximation is definitely too rough.

Figure 8: Measured versus simulated results for the patch antenna shown in Fig. 4, when the new excitation model is used: Wg = 214 mm, Lg = 214 mm (increment 5 MHz clockwise, measurement reproduced from [James and Hall 1989]).

Figure 9: Measured versus simulated results for a perfectly balanced patch antenna: Wp = Wg = 120.1 mm, Lp = Lg = 79.5 mm, Lp = Lg = 80 mm, Xp = 60 mm, Yp = 29 mm. Substrate: RT/DUROID 5870, h = 1.57 mm, $\varepsilon_r$ = 2.33, tan δ = 0.0012 (increment 2.5 MHz clockwise).

4.1  RADIATION  PATTERNS 

Another  interesting  aspect  of  the  excitation  model  derived 
in  this  paper  is  the  prediction  of  the  back  radiation  and 
the  side  lobe  levels  of  microstrip  printed  antennas.  In  the 
present  work  the  far  field  radiated  by  the  structure  has  been 
computed  with  the  aid  of  asymptotic  expressions  for  the 
multilayered  media  Green’s  functions,  valid  for  large  values 
of  source-observer  distances.  These  asymptotic  expressions 
are  based  on  the  use  of  the  saddle  point  method,  which  al¬ 
lows  the  analytical  evaluation  of  a  Fourier  integral  by  just 
considering  the  contribution  of  the  function  at  the  saddle 
point [Mosig and Gardiol 1982]. It is important to bear in
mind that in a multilayered medium, horizontal currents can
in general produce both horizontal and longitudinal (along
z) components of the electromagnetic fields. This comes
from  the  fact  that  the  dyad  associated  with  the  magnetic 
vector  potential  is  not  a  diagonal  dyad,  but  it  rather  contains 


off-diagonal elements. For instance, if the so-called Sommerfeld
choice is selected, then the whole magnetic vector
potential dyad can be written, for only horizontal currents,
as [Mosig and Gardiol 1985, Mosig 1989]

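$$\bar{\mathbf{G}}_A = \left( \mathbf{e}_x G_A^{xx} + \mathbf{e}_z G_A^{zx} \right) \mathbf{e}_x + \left( \mathbf{e}_y G_A^{yy} + \mathbf{e}_z G_A^{zy} \right) \mathbf{e}_y \qquad (16)$$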

where,  as  already  said,  the  spectral  domain  Green’s 
functions  appearing  in  equation  (16)  are  derived  from 
voltages  and  currents  computed  in  the  equivalent 
transmission  line  network  of  Fig.  1(b),  as  described  in 
[Mosig  and  Gardiol  1988,  Michalski  and  Mosig  1997]. 
For  the  Green’s  functions  of  interest  in  (16)  one  obtains 
[Michalski and Mosig 1997]

(17)

where TE, TM denote transverse electric and
transverse magnetic (with respect to the z-axis)
waves, and the transverse wavenumbers are given
by [Mosig and Gardiol 1982]: $k_\rho = k_0 \sin\theta$,
$k_x = -k_0 \sin\theta \cos\varphi$, $k_y = -k_0 \sin\theta \sin\varphi$.
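
As a small illustration, the transverse wavenumbers for a given observation direction can be evaluated as below. The function name is hypothetical, angles are taken in radians, and the free-space wavenumber is computed as k0 = 2πf/c, which the text uses implicitly.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s

def far_field_wavenumbers(freq_hz, theta, phi):
    """Return (k0, k_rho, k_x, k_y) in rad/m for the direction (theta, phi)."""
    k0 = 2.0 * math.pi * freq_hz / C0
    k_rho = k0 * math.sin(theta)
    k_x = -k0 * math.sin(theta) * math.cos(phi)
    k_y = -k0 * math.sin(theta) * math.sin(phi)
    return k0, k_rho, k_x, k_y

if __name__ == "__main__":
    print(far_field_wavenumbers(5.02e9, math.radians(30.0), math.radians(45.0)))
```

For 0 ≤ θ ≤ π the radial wavenumber is non-negative and satisfies k_x² + k_y² = k_ρ², so each observation angle selects a single point of the spectral-domain Green's functions.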

The  main  difficulty  is  then  reduced  to  the  calculation 
of  these  Green’s  functions  in  the  spatial  domain.  For 
this  purpose  the  inverse  Fourier  integral  is  evaluated 
with  the  saddle  point  technique,  and,  as  shown  in 
[Mosig  and  Gardiol  1982],  one  finally  obtains  in  the 
spatial  domain  the  following  simple  relation 

$$G_A^{st} = j k_0 \cos\theta \, \tilde{G}_A^{st} \, \frac{e^{-j k_0 R}}{R} \qquad (18)$$

where  s,  t  =  x,y,z,  and  R  is  the  source-observer  distance. 
It  is  important  to  remark  that  for  the  derivation  of  equation 
(18)  the  spectral  domain  Green’s  functions  are  assumed  to 
have a free space dependence of the type exp(-jβz). The
main implication of this is that the voltages and currents in
equation (17) must be computed at the first air-dielectric
interface for 0 < θ < π/2, and they must be computed at the
last air-dielectric interface for π/2 < θ < π. Having all
these  computational  details  in  mind,  an  accurate  evaluation 
of  the  radiation  patterns  of  microstrip  antennas  printed  on 
finite size ground planes has been carried out. Figs. 10,
11 and 12 present the measured and computed results

Figure 10: Radiation patterns of the printed patch antenna shown in Fig. 4. Ground plane size: Wg = 60 mm, Lg = 60 mm. Frequency is 5.020 GHz (measurement reproduced from [Bokhari et al. 1992]). (a) E-plane.


for the E- and H-plane radiation patterns of the antenna
shown in Fig. 4 with ground plane sizes Wg = 60 mm,
Lg = 60 mm; Wg = 90 mm, Lg = 90 mm; and
Wg = 180 mm, Lg = 180 mm (respectively λ0 × λ0,
1.5λ0 × 1.5λ0 and 3λ0 × 3λ0 at 5.02 GHz). The results presented
indicate that the agreement is good, and in particular
the predicted level of back radiation is close to the
measured one. It is important to mention that a model using
an infinite ground plane gives no information concerning
the level of back radiation of the antenna, which is assumed
to be zero. In contrast, with the new excitation model
derived in this paper, an accurate estimation of the back radiation
level can be obtained. It must also be pointed out
that the present model still uses layered Green's functions
and includes neither the radiation of the probe itself
nor the effect of the dielectric layer finiteness.
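
The electrical sizes quoted for the three ground planes can be checked with a few lines of Python. The helper name is hypothetical; the only inputs are the side lengths and the 5.02 GHz frequency given in the text.

```python
C0 = 299_792_458.0  # speed of light in vacuum, m/s

def size_in_wavelengths(size_mm, freq_hz):
    """Express a length in millimetres as a multiple of the free-space wavelength."""
    wavelength_mm = C0 / freq_hz * 1e3
    return size_mm / wavelength_mm

if __name__ == "__main__":
    for side_mm in (60.0, 90.0, 180.0):
        print(side_mm, round(size_in_wavelengths(side_mm, 5.02e9), 3))
```

At 5.02 GHz the free-space wavelength is about 59.7 mm, so the 60 mm, 90 mm and 180 mm ground planes are very nearly one, one and a half and three wavelengths on a side.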

These two aspects of the problem could also be included
in the model by means of, respectively, vertical conduction
and polarisation currents, and work towards this goal
is in progress. However, the results of Figs. 10-12 show
clearly that the only noticeable improvement would be the
filling  of  the  deep  nulls  at  ±90°  and  that  except  for  this 
minor  correction,  our  model  in  its  current  status  follows 


Figure 11: Radiation patterns of the printed patch antenna shown in Fig. 4. Ground plane size: Wg = 90 mm, Lg = 90 mm. Frequency is 5.020 GHz (measurement reproduced from [Bokhari et al. 1992]). (a) E-plane.

closely the measured values, while still retaining a reasonable
simplicity which would be lost if the aforementioned
effects were included.

5  CONCLUSION 

A  new  excitation  model  for  coaxially  fed  printed  microstrip 
antennas,  developed  in  the  frame  of  the  mixed  potential  in¬ 
tegral  equation  (MPIE)  and  the  method  of  moments  (MoM), 
has  been  presented.  Moreover,  a  modified  version  of  this 
model  allows  the  analysis  of  these  antennas  on  finite  size 
ground  planes.  This  model  has  been  successfully  applied 
to  the  prediction  of  input  impedances  for  patches  above 
ground  planes  whose  size  ranges  from  the  patch  size  to  in¬ 
finity.  With  this  approach,  scattering  from  ground  plane 
edges  can  be  taken  into  account  and  full  range  (including 
backside  scattering)  radiation  patterns  can  also  be  predicted. 
The paper has first presented the theoretical basis of the newly
derived excitation method, including the numerical details
needed  for  a  correct  far  field  computation.  Theoretical  re¬ 
sults  have  been  compared  with  measurements,  for  both  the 
input  impedance  and  the  radiation  patterns.  Comparisons 
have revealed that the accuracy achieved with the new excitation
method is very satisfactory, and in particular the
backside radiation and side lobe levels of real life printed
antennas can accurately be predicted.

ACES JOURNAL, VOL. 15, NO. 2, JULY 2000


Figure 12: Radiation patterns of the printed patch antenna shown in Fig. 4. Ground plane size: Wg = 180 mm, Lg = 180 mm. Frequency is 5.020 GHz (measurement reproduced from [Bokhari et al. 1992]). (a) E-plane.

PUBLICATION CRITERIA

Each  paper  is  required  to  manifest  some  relation  to  applied 
computational  electromagnetics.  Papers  may  address 
general  issues  in  applied  computational  electromagnet¬ 
ics,  or  they  may  focus  on  specific  applications,  tech¬ 
niques,  codes,  or  computational  issues.  While  the 
following  list  is  not  exhaustive,  each  paper  will  generally 
relate  to  at  least  one  of  these  areas: 

1.  Code  validation.  This  is  done  using  internal  checks  or 
experimental,  analytical  or  other  computational  data. 
Measured  data  of  potential  utility  to  code  validation  efforts 
will  also  be  considered  for  publication. 

2.  Code  performance  analysis.  This  usually  involves 
identification  of  numerical  accuracy  or  other  limitations, 
solution  convergence,  numerical  and  physical  modeling 
error,  and  parameter  tradeoffs.  However,  it  is  also 
permissible  to  address  issues  such  as  ease-of-use,  set-up 
time,  run  time,  special  outputs,  or  other  special  features. 

3.  Computational  studies  of  basic  physics.  This  involves 
using  a  code,  algorithm,  or  computational  technique  to 
simulate  reality  in  such  a  way  that  better  or  new  physical 
insight  or  understanding  is  achieved. 

4.  New  computational  techniques,  or  new  applications 
for  existing  computational  techniques  or  codes. 

5. "Tricks of the trade" in selecting and applying codes
and  techniques. 

6.  New  codes,  algorithms,  code  enhancement,  and  code 
fixes.  This  category  is  self-explanatory  but  includes 
significant  changes  to  existing  codes,  such  as  applicability 
extensions,  algorithm  optimization,  problem  correction, 
limitation  removal,  or  other  performance  improvement. 
Note:  Code  (or  algorithm)  capability  descriptions  are 
not  acceptable,  unless  they  contain  sufficient  technical 
material  to  justify  consideration. 

7.  Code  input/output  issues.  This  normally  involves 
innovations  in  input  (such  as  input  geometry 
standardization,  automatic  mesh  generation,  or  computer- 
aided  design)  or  in  output  (whether  it  be  tabular,  graphical, 
statistical, Fourier-transformed, or otherwise signal-processed).
Material dealing with input/output database
management,  output  interpretation,  or  other  input/output 
issues  will  also  be  considered  for  publication. 

8.  Computer  hardware  issues.  This  is  the  category  for 
analysis  of  hardware  capabilities  and  limitations  in  meeting 
various  types  of  electromagnetics  computational  require¬ 
ments.  Vector  and  parallel  computational  techniques  and 
implementation  are  of  particular  interest. 


Applications  of  interest  include,  but  are  not  limited  to, 
antennas  (and  their  electromagnetic  environments), 
networks,  static  fields,  radar  cross  section,  shielding, 
radiation  hazards,  biological  effects,  electromagnetic  pulse 
(EMP),  electromagnetic  interference  (EMI),  electromagnet¬ 
ic  compatibility  (EMC),  power  transmission,  charge 
transport,  dielectric  and  magnetic  materials,  microwave 
components,  MMIC  technology,  remote  sensing  and  geo¬ 
physics,  communications  systems,  fiber  optics,  plasmas, 
particle accelerators, generators and motors, electromagnetic
wave propagation, non-destructive evaluation, eddy
currents,  and  inverse  scattering. 

Techniques  of  interest  include  frequency-domain  and 
time-domain  techniques,  integral  equation  and  differential 
equation  techniques,  diffraction  theories,  physical  optics, 
moment methods, finite differences and finite element
techniques,  modal  expansions,  perturbation  methods,  and 
hybrid  methods.  This  list  is  not  exhaustive. 

A  unique  feature  of  the  Journal  is  the  publication  of 
unsuccessful  efforts  in  applied  computational 
electromagnetics.  Publication  of  such  material  provides  a 
means  to  discuss  problem  areas  in  electromagnetic  model¬ 
ing.  Material  representing  an  unsuccessful  application  or 
negative  results  in  computational  electromagnetics  will  be 
considered  for  publication  only  if  a  reasonable  expectation 
of  success  (and  a  reasonable  effort)  are  reflected. 
Moreover,  such  material  must  represent  a  problem  area  of 
potential  interest  to  the  ACES  membership. 

Where  possible  and  appropriate,  authors  are  required  to 
provide  statements  of  quantitative  accuracy  for  measured 
and/or  computed  data.  This  issue  is  discussed  in  "Accuracy 
&  Publication:  Requiring  quantitative  accuracy  statements 
to accompany data", by E.K. Miller, ACES Newsletter, Vol.
9, No. 3, pp. 23-29, 1994, ISSN 1056-9170.

EDITORIAL  REVIEW 

In  order  to  ensure  an  appropriate  level  of  quality  control, 
papers  are  refereed.  They  are  reviewed  both  for  technical 
correctness  and  for  adherence  to  the  listed  guidelines 
regarding  information  content.  Authors  should  submit  the 
initial  manuscript  in  draft  form  so  that  any  suggested 
changes  can  be  made  before  the  photo-ready  copy  is 
prepared  for  publication. 

STYLE  FOR  CAMERA-READY  COPY

The  ACES  Journal  is  flexible,  within  reason,  in  regard  to 
style.  However,  certain  requirements  are  in  effect: 

1.  The  paper  title  should  NOT  be  placed  on  a  separate 
page.  The  title,  author(s),  abstract,  and  (space  permitting) 
beginning  of  the  paper  itself  should  all  be  on  the  first  page. 
The  title,  author(s),  and  author  affiliations  should  be 
centered  (center-justified)  on  the  first  page. 

2.  An  abstract  is  REQUIRED.  The  abstract  should  state 
the  computer  codes,  computational  techniques,  and 
applications  discussed  in  the  paper  (as  applicable)  and 
should  otherwise  be  usable  by  technical  abstracting  and 
indexing  services. 

3.  Either  British  English  or  American  English  spellings 
may  be  used,  provided  that  each  word  is  spelled 
consistently  throughout  the  paper. 

4.  Any  commonly-accepted  format  for  referencing  is 
permitted,  provided  that  internal  consistency  of  format  is 
maintained.  As  a  guideline  for  authors  who  have  no  other 
preference,  we  recommend  that  references  be  given  by 
author(s)  name  and  year  in  the  body  of  the  paper  (with 
alphabetical  listing  of  all  references  at  the  end  of  the  paper). 
Titles  of  Journals,  monographs,  and  similar  publications 
should  be  in  boldface  or  italic  font  or  should  be  underlined. 
Titles  of  papers  or  articles  should  be  in  quotation  marks. 

5.  Internal  consistency  shall  also  be  maintained  for  other 
elements  of  style,  such  as  equation  numbering.  As  a 
guideline  for  authors  who  have  no  other  preference,  we 
suggest  that  equation  numbers  be  placed  in  parentheses  at 
the  right  column  margin. 

6.  The  intent  and  meaning  of  all  text  must  be  clear.  For 
authors  who  are  NOT  masters  of  the  English  language,  the 
ACES  Editorial  Staff  will  provide  assistance  with  grammar 
(subject  to  clarity  of  intent  and  meaning). 

7.  Unused  space  should  be  minimized.  Sections  and 
subsections  should  not  normally  begin  on  a  new  page. 

MATERIAL,  SUBMITTAL  FORMAT  AND 
PROCEDURE 

The  preferred  format  for  submission  and  subsequent
review  is  12  point  font  or  12  cpi,  double  line  spacing,  and
a  single  column  per  page.  Four  copies  of  all  submissions
should  be  sent  to  the  Editor-in-Chief  (see  inside  front 
cover).  Each  submission  must  be  accompanied  by  a 
covering  letter.  The  letter  should  include  the  name, 
address,  and  telephone  and/or  fax  number  and/or  e-mail 
address  of  at  least  one  of  the  authors. 


Only  camera-ready  original  copies  are  accepted  for 
publication.  The  term  "camera-ready"  means  that  the
material  is  neat,  legible,  and  reproducible.  The  preferred 
font  style  is  Times  Roman  10  point  (or  equivalent)  such  as 
that  used  in  this  text.  A  double  column  format  similar  to 
that  used  here  is  preferred.  No  author’s  work  will  be 
turned  down  once  it  has  been  accepted  because  of  an 
inability  to  meet  the  requirements  concerning  fonts  and 
format.  Full  details  are  sent  to  the  author(s)  with  the  letter 
of  acceptance. 

There  is  NO  requirement  for  India  ink  or  for  special  paper; 
any  plain  white  paper  may  be  used.  However,  faded  lines 
on  figures  and  white  streaks  along  fold  lines  should  be 
avoided.  Original  figures  -  even  paste-ups  -  are  preferred 
over  "nth-generation"  photocopies.  These  original  figures 
will  be  returned  if  you  so  request. 

While  ACES  reserves  the  right  to  re-type  any  submitted 
material,  this  is  not  generally  done. 

PUBLICATION  CHARGES 

ACES  members  are  allowed  12  pages  per  paper  without 
charge;  non-members  are  allowed  8  pages  per  paper 
without  charge.  Mandatory  page  charges  of  $75  a  page 
apply  to  all  pages  in  excess  of  12  for  members  or  8  for 
non-members.  Voluntary  page  charges  are  requested  for 
the  free  (12  or  8)  pages,  but  are  NOT  mandatory  or 
required  for  publication.  A  priority  courtesy  guideline, 
which  favors  members,  applies  to  paper  backlogs.  Full 
details  are  available  from  the  Editor-in-Chief. 

COPYRIGHTS  AND  RELEASES 

Each  primary  author  must  sign  a  copyright  form  and  obtain 
a  release  from  his/her  organization  vesting  the  copyright 
with  ACES.  Forms  will  be  provided  by  ACES.  Both  the 
author  and  his/her  organization  are  allowed  to  use  the 
copyrighted  material  freely  for  their  own  private  purposes. 

Permission  is  granted  to  quote  short  passages  and  reproduce 
figures  and  tables  from  an  ACES  Journal  issue  provided  the 
source  is  cited.  Copies  of  ACES  Journal  articles  may  be 
made  in  accordance  with  usage  permitted  by  Sections  107 
or  108  of  the  U.S.  Copyright  Law.  This  consent  does  not 
reproduction  of  multiple  copies  and  the  use  of  articles  or